ABSTRACT

A key problem of robotic environmental sensing and moni- 
toring is that of active sensing: How can a team of robots 
plan the most informative observation paths to minimize 
the uncertainty in modeling and predicting an environmen- 
tal phenomenon? This paper presents two principled ap- 
proaches to efficient information-theoretic path planning based 
on entropy and mutual information criteria for in situ active
sensing of an important broad class of widely-occurring
environmental phenomena called anisotropic fields. Our pro- 
posed algorithms are novel in addressing a trade-off between 
active sensing performance and time efficiency. An impor- 
tant practical consequence is that our algorithms can exploit 
the spatial correlation structure of Gaussian process-based 
anisotropic fields to improve time efficiency while preserv- 
ing near-optimal active sensing performance. We analyze 
the time complexity of our algorithms and prove analyti- 
cally that they scale better than state-of-the-art algorithms 
with increasing planning horizon length. We provide the- 
oretical guarantees on the active sensing performance of 
our algorithms for a class of exploration tasks called tran- 
sect sampling, which, in particular, can be improved with 
longer planning time and/or lower spatial correlation along 
the transect. Empirical evaluation on real-world anisotropic
field data shows that our algorithms can perform better or 
at least as well as the state-of-the-art algorithms while of- 
ten incurring a few orders of magnitude less computational 
time, even when the field conditions are less favorable. 

Categories and Subject Descriptors 

G.3 [Probability and Statistics]: Stochastic processes; 
I.2.9 [Robotics]: Autonomous vehicles

General Terms 

Algorithms, Performance, Experimentation, Theory 

Keywords 

Multi-robot exploration and mapping, adaptive sampling, 
active learning, Gaussian process, non-myopic path planning 

1. INTRODUCTION 

Research in environmental sensing and monitoring has recently
gained significant attention and practical interest, especially
in supporting environmental sustainability efforts
worldwide. A key direction of this research aims at sensing, 
modeling, and predicting the various types of environmen- 
tal phenomena spatially distributed over our natural and 
built-up habitats so as to improve our knowledge and un- 
derstanding of their economic, environmental, and health 
impacts and implications. This is non-trivial to achieve due 
to a trade-off between the quantity of sensing resources (e.g.,
number of deployed sensors, energy consumption, mission 
time) and the uncertainty in predictive modeling. In the 
case of deploying a limited number of mobile robotic sens- 
ing assets, such a trade-off motivates the need to plan the 
most informative resource-constrained observation paths to 
minimize the uncertainty in modeling and predicting a spa- 
tially varying environmental phenomenon, which constitutes 
the active sensing problem to be addressed in this paper. 

A wide multitude of natural and urban environmental 
phenomena is characterized by spatially correlated field mea- 
surements, which raises the following fundamental issue faced 
by the active sensing problem: 

How can the spatial correlation structure of an 
environmental phenomenon be exploited to im- 
prove the active sensing performance and com- 
putational efficiency of robotic path planning? 



The works of [11, 12, 13] have tackled this issue specifically
in the context of an environmental hotspot field by study- 
ing how its spatial correlation structure affects the perfor- 
mance advantage of adaptivity in path planning: If the field 
is large with a few small hotspots exhibiting extreme mea- 
surements and much higher spatial variability than the rest 
of the field, then adaptivity can provide better active sens- 
ing performance. On the other hand, non-adaptive sampling 
techniques [2, 8, 14] suffice for smoothly-varying fields.

In this paper, we will investigate the above issue for an- 
other important broad class of environmental phenomena 
called anisotropic fields that exhibit a (often much) higher 
spatial correlation along one direction than along its per- 
pendicular direction. Such fields occur widely in natural 
and built-up environments and some of them include (a) 
ocean and freshwater phenomena like plankton density [g], 
fish abundance [23], temperature and salinity [22]; (b) soil 
and atmospheric phenomena like peat thickness [25], surface
soil moisture [26], rainfall [18]; (c) mineral deposits like
radioactive ore [19]; (d) pollutant and contaminant concentration
like air [1], heavy metals [16]; and (e) ecological abundance
like vegetation density [9].

The geostatistics community has examined a related issue 
of how the spatial correlation structure of an anisotropic field
can be exploited to improve the predictive performance of 
a sampling design for a static sensor network. To resolve 
this, the following heuristic design [25] is commonly used 
for sampling the anisotropic fields described above: Arrange 
and place the static sensors in a rectangular grid such that 
one axis of the grid is aligned along the direction of lowest 
spatial correlation (i.e., highest spatial variability) and the 
grid spacing along this axis as compared to that along its 
perpendicular axis is proportional to the ratio of their re- 
spective spatial correlations. In the case of path planning 
for k robots, one may consider the sampling locations of the 
rectangular grid as cities to be visited in a k-traveling salesman
problem so as to minimize the total distance traveled or
mission time [15]. However, since the resulting observation
paths are constrained by the heuristic sampling design, they 
are suboptimal in solving the active sensing problem (i.e., 
minimizing the predictive uncertainty). This drawback is 
exacerbated when the robots are capable of sampling at a 
higher resolution along their paths (e.g., due to high sensor 
sampling rate) than that of the grid, hence gathering subop- 
timal observations while traversing between grid locations. 

This paper presents two principled approaches to efficient 
information-theoretic path planning based on entropy and 
mutual information (respectively, Sections 3 and 4) criteria
for in situ active sensing of environmental phenomena. In 
contrast to the existing methods described above, our pro- 
posed path planning algorithms are novel in addressing a 
trade-off between active sensing performance and computa- 
tional efficiency. An important practical consequence is that 
our algorithms can exploit the spatial correlation structure 
of anisotropic fields to improve time efficiency while preserv- 
ing near-optimal active sensing performance. The specific 
contributions of our work in this paper include: 

• Analyzing the time complexity of our proposed algorithms 
and proving analytically that they scale better than state- 
of-the-art information-theoretic path planning algorithms 
[s] [13] with increasing length of planning horizon
(Sections 3 and 4);

• Providing theoretical guarantees on the active sensing performance
of our proposed algorithms (Sections 3 and 4) for
a class of exploration tasks called the transect sampling
task (Section 2.1), which, in particular, can be improved
with longer planning time and/or lower spatial correlation 
along the transect; 

• Empirically evaluating the time efficiency and active sens- 
ing performance of our proposed algorithms on real-world 
temperature and plankton density field data (Section 5).

2. BACKGROUND 

2.1 Transect Sampling Task 




In a transect sampling task [14, 24], a team of k robots is
tasked to explore and sample an environmental phenomenon
spatially distributed over a transect (Fig. 1) that is discretized
into an r × n grid of sampling locations where the
number n of columns is assumed to be much larger than 
the number r of sampling locations in each column, r is ex- 
pected to be small in a transect, and k < r. The columns 
are indexed in an increasing order from left to right. The 
k robots are constrained to simultaneously explore forward 
one column at a time from the leftmost column '1' to the 
rightmost column 'n' such that each robot samples one location
per column for a total of n locations. Hence, each
robot, given its current location, can move to any of the r
locations in the adjacent column on its right.

Figure 1: Transect sampling task with 2 robots on
a temperature field (measured in °C) spatially distributed
over a 25 m × 150 m transect that is discretized
into a 5 × 30 grid of sampling locations (white
dots) (Image courtesy of [14]).

In practice, the transect sampling task is especially ap- 
propriate for and widely performed by mobile robots with 
limited maneuverability (e.g., unmanned aerial vehicles, autonomous
surface and underwater vehicles (AUVs) [21]) because
it involves less complex path maneuvers that can be
achieved more reliably using less sophisticated on-board con- 
trol algorithms. In terms of practical applicability, transect 
sampling is a particularly useful exploration task to be per- 
formed during the transit from the robot's current location 
to a distant planned waypoint [10, 24] to collect the most
informative observations. For active sensing of ocean and 
freshwater phenomena, the transect can span a spatial fea- 
ture of interest such as a harmful algal bloom or pollutant 
plume to be explored and sampled by a fleet of AUVs being
deployed off a ship vessel. 

2.2 Gaussian Process-Based Anisotropic Field 

An environmental phenomenon is defined to vary as a re- 
alization of a rich class of Bayesian non-parametric mod- 
els called the Gaussian process (GP) [20] that can formally 
characterize its spatial correlation structure and be refined 
with increasing number of observations. More importantly, 
GP can provide formal measures of predictive uncertainty 
(e.g., based on an entropy or mutual information criterion) 
for directing the robots to explore the highly uncertain areas 
of the phenomenon. 

Let $\mathcal{D}$ be a set of sampling locations representing the domain
of the environmental phenomenon such that each location
$x \in \mathcal{D}$ is associated with a realized (random) measurement
$z_x$ ($Z_x$) if $x$ is sampled/observed (unobserved).
Let $\{Z_x\}_{x \in \mathcal{D}}$ denote a GP, that is, every finite subset of
$\{Z_x\}_{x \in \mathcal{D}}$ has a multivariate Gaussian distribution [20]. The
GP is fully specified by its prior mean $\mu_x \triangleq \mathbb{E}[Z_x]$ and covariance
$\sigma_{xx'} \triangleq \mathrm{cov}[Z_x, Z_{x'}]$ for all $x, x' \in \mathcal{D}$. In the experiments
(Section 5), we assume that the GP is second-order stationary,
i.e., it has a constant prior mean and a stationary prior
covariance structure (i.e., $\sigma_{xx'}$ is a function of $x - x'$ for all
$x, x' \in \mathcal{D}$), both of which are assumed to be known. In particular,
its covariance structure is defined by the widely-used
squared exponential covariance function



$$\sigma_{xx'} \triangleq \sigma_s^2 \exp\left\{-\frac{1}{2}(x - x')^\top M^{-2} (x - x')\right\} + \sigma_n^2 \delta_{xx'} \tag{1}$$

where $\sigma_s^2$ and $\sigma_n^2$ are, respectively, the signal and noise variances
controlling the intensity and noise of the measurements,
$M$ is a diagonal matrix with length-scale components
$\ell_1$ and $\ell_2$ controlling the degree of spatial correlation
or "similarity" between measurements along (i.e., horizontal
direction) and perpendicular to (i.e., vertical direction) the
transect, respectively, and $\delta_{xx'}$ is a Kronecker delta of value
1 if $x = x'$, and 0 otherwise. For anisotropic fields, $\ell_1 \neq \ell_2$.
An advantage of using GP to model the environmental
phenomenon is its probabilistic regression capability: Given
a vector $s$ of sampled locations and a column vector $z_s$ of
corresponding measurements, the joint distribution of the
measurements at any vector $u$ of $k$ unobserved locations
remains Gaussian with the following posterior mean vector
and covariance matrix

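$$\mu_{u|s} \triangleq \mu_u + \Sigma_{us}\Sigma_{ss}^{-1}(z_s - \mu_s) \tag{2}$$

$$\Sigma_{uu|s} \triangleq \Sigma_{uu} - \Sigma_{us}\Sigma_{ss}^{-1}\Sigma_{su} \tag{3}$$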

where $\mu_u$ ($\mu_s$) is a column vector with mean components
$\mu_x$ for every location $x$ of $u$ ($s$), $\Sigma_{us}$ ($\Sigma_{ss}$) is a covariance
matrix with covariance components $\sigma_{xx'}$ for every pair of
locations $x$ of $u$ ($s$) and $x'$ of $s$, and $\Sigma_{su}$ is the transpose
of $\Sigma_{us}$. The posterior mean vector $\mu_{u|s}$ (2) is used to predict
the measurements at vector $u$ of $k$ unobserved locations.
The uncertainty of these predictions can be quantified using
the posterior covariance matrix $\Sigma_{uu|s}$ (3), which is independent
of the measurements $z_s$, in two ways: (a) the trace of
$\Sigma_{uu|s}$ yields the sum of posterior variances $\Sigma_{x|s}$ over every
location $x$ of $u$; (b) the determinant of $\Sigma_{uu|s}$ is used in
calculating the Gaussian posterior joint entropy
$$H(Z_u \mid Z_s) \triangleq \frac{1}{2}\log\left[(2\pi e)^{k}\,\left|\Sigma_{uu|s}\right|\right] \tag{4}$$



Unlike the first measure of predictive uncertainty which assumes
conditional independence between measurements at
vector $u$ of unobserved locations, the entropy-based measure
(4) accounts for their correlation, thereby not overestimating
their uncertainty. Hence, we will focus on using the
entropy-based measure of uncertainty in this paper.
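
To make the quantities of this section concrete, the following is a minimal NumPy sketch of the covariance function (1), the GP posterior (2) and (3), and the joint entropy (4). It is an illustration under stated assumptions rather than the authors' implementation: the function names, the constant prior mean `mu`, and the hyperparameter values in the usage lines are hypothetical.

```python
import numpy as np

def sq_exp_cov(A, B, ell, sig_s2, sig_n2):
    """Covariance (1) between location arrays A (p x 2) and B (q x 2)."""
    A, B = np.asarray(A, float), np.asarray(B, float)
    # M is diagonal, so M^{-2} amounts to dividing each coordinate by its length-scale
    d = (A[:, None, :] - B[None, :, :]) / np.asarray(ell, float)
    K = sig_s2 * np.exp(-0.5 * np.sum(d ** 2, axis=2))
    same = np.all(A[:, None, :] == B[None, :, :], axis=2)  # Kronecker delta
    return K + sig_n2 * same

def gp_posterior(u, s, z_s, mu, ell, sig_s2, sig_n2):
    """Posterior mean (2) and covariance (3) at u given measurements z_s at s."""
    S_uu = sq_exp_cov(u, u, ell, sig_s2, sig_n2)
    S_us = sq_exp_cov(u, s, ell, sig_s2, sig_n2)
    S_ss = sq_exp_cov(s, s, ell, sig_s2, sig_n2)
    W = np.linalg.solve(S_ss, S_us.T).T  # S_us S_ss^{-1}, using symmetry of S_ss
    mean = mu + W @ (np.asarray(z_s, float) - mu)
    cov = S_uu - W @ S_us.T
    return mean, cov

def gaussian_entropy(cov):
    """Joint entropy (4): 0.5 * log((2*pi*e)^k * |cov|)."""
    k = cov.shape[0]
    return 0.5 * (k * np.log(2 * np.pi * np.e) + np.linalg.slogdet(cov)[1])

# Hypothetical usage: two sampled and two unobserved locations (in metres)
s = [(0.0, 0.0), (5.0, 0.0)]
u = [(2.5, 0.0), (2.5, 5.0)]
mean, cov = gp_posterior(u, s, [20.1, 20.4], mu=20.0,
                         ell=(40.0, 5.0), sig_s2=1.0, sig_n2=0.05)
joint = gaussian_entropy(cov)
```

Comparing `joint` with the sum of the entropies of the individual posterior variances illustrates the point made above: treating the unobserved locations as independent can only overstate the joint uncertainty.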

3. ENTROPY-BASED PATH PLANNING 

Notations. Each planning stage $i$ is associated with column
$i$ of the transect for $i = 1, \ldots, n$. In each stage $i$, the team
of $k$ robots samples from column $i$ a total of $k$ observations
(each of which comprises a pair of a location and its measurement)
that are denoted by a pair of vectors $x_i$ of $k$ locations
and $Z_{x_i}$ of the corresponding random measurements. Let $\mathcal{X}_i$
denote the set of all possible robots' sampling locations $x_i$
in stage $i$. It can be observed that $\chi \triangleq |\mathcal{X}_1| = \cdots = |\mathcal{X}_n| = {}^{r}C_k$.
We assume that the robots can deterministically (i.e.,
no stochasticity in motion) move from their current locations
$x_{i-1}$ in column $i - 1$ to the next locations $x_i$ in column $i$.
Let $x_{i:j}$ and $Z_{x_{i:j}}$ denote vectors concatenating robots' sampling
locations $x_i, \ldots, x_j$ and concatenating corresponding
random measurements $Z_{x_i}, \ldots, Z_{x_j}$ over stages $i$ to $j$, respectively,
and $\mathcal{X}_{i:j}$ denote the set of all possible $x_{i:j}$.
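
As a small illustration of this notation, the set of joint sampling locations for one column can be enumerated directly. The sketch below assumes that robots are interchangeable and occupy distinct rows (which is what the count of r-choose-k suggests); the zero-based row indices and the name `joint_locations` are hypothetical.

```python
from itertools import combinations
from math import comb

def joint_locations(r, k):
    """All ways for k robots to occupy k distinct rows of one column."""
    return list(combinations(range(r), k))

# Values matching the 5 x 30 temperature grid with 2 robots
r, k = 5, 2
X_i = joint_locations(r, k)
chi = len(X_i)  # size of the per-stage set of joint sampling locations
```

With r = 5 and k = 2 this gives ten joint choices per column, which is the base of the exponential terms in the complexity results that follow.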

Maximum Entropy Path Planning (MEPP). The work
of [13] has proposed planning non-myopic observation paths
$x^*_{1:n}$ with maximum entropy (i.e., highest uncertainty):

$$x^*_{1:n} = \arg\max_{x_{1:n} \in \mathcal{X}_{1:n}} H(Z_{x_{1:n}}) \tag{5}$$

that, as proven in an equivalence result, minimize the posterior
entropy/uncertainty remaining in the unobserved locations
of the transect. Computing the maximum entropy
paths $x^*_{1:n}$ incurs $O(\chi^n (kn)^3)$ time, which is exponential in the
length $n$ of planning horizon. To mitigate this computational
difficulty, an anytime heuristic search algorithm [t]
is used to compute (5) approximately. However, its performance
cannot be guaranteed. Furthermore, as reported in
[14], when $\chi$ or $n$ is large, its computed paths perform poorly
even after incurring a huge amount of search time and space.

Approximate MEPP(m). To establish a trade-off between
active sensing performance and computational efficiency,
the key idea is to exploit a property of the covariance
function (1) that the spatial correlation of measurements
between any two locations decreases exponentially with increasing
distance between them. Intuitively, such a property
makes the measurements $Z_{x_i}$ to be observed next in column
$i$ near-independent of the past distant measurements
$Z_{x_{1:i-m-1}}$ observed from columns 1 to $i - m - 1$ (i.e., far from
column $i$) for a sufficiently large $m$ by conditioning on the
closer measurements $Z_{x_{i-m:i-1}}$ observed in columns $i - m$ to
$i - 1$ (i.e., closer to column $i$). Consequently, $H(Z_{x_i} \mid Z_{x_{1:i-1}})$
can still be closely approximated by $H(Z_{x_i} \mid Z_{x_{i-m:i-1}})$ after
assuming an $m$-th order Markov property, thus yielding the
following approximation of the joint entropy $H(Z_{x_{1:n}})$ in (5):

$$\begin{aligned}
H(Z_{x_{1:n}}) &= H(Z_{x_{1:m}}) + \sum_{i=m+1}^{n} H(Z_{x_i} \mid Z_{x_{1:i-1}}) \\
&\approx H(Z_{x_{1:m}}) + \sum_{i=m+1}^{n} H(Z_{x_i} \mid Z_{x_{i-m:i-1}})
\end{aligned} \tag{6}$$



The first equality is due to the chain rule for entropy [3].
Using (6), MEPP (5) can be approximated by the following
stage-wise dynamic programming equations, which we call
MEPP(m):

$$\begin{aligned}
V_i(x_{i-m:i-1}) &= \max_{x_i \in \mathcal{X}_i} H(Z_{x_i} \mid Z_{x_{i-m:i-1}}) + V_{i+1}(x_{i-m+1:i}) \\
V_n(x_{n-m:n-1}) &= \max_{x_n \in \mathcal{X}_n} H(Z_{x_n} \mid Z_{x_{n-m:n-1}})
\end{aligned} \tag{7}$$

for stage $i = m+1, \ldots, n-1$, each of which induces a corresponding
optimal vector $x^{\mathrm{E}}_i$ of $k$ locations given the optimal
vector $x^{\mathrm{E}}_{i-m:i-1}$ obtained from previous stages $i - m$ to $i - 1$.¹
Let the optimal observation paths of MEPP(m) be denoted
by $x^{\mathrm{E}}_{1:n}$ that concatenates

$$x^{\mathrm{E}}_{1:m} = \arg\max_{x_{1:m} \in \mathcal{X}_{1:m}} H(Z_{x_{1:m}}) + V_{m+1}(x_{1:m}) \tag{8}$$

for the first $m$ stages and $x^{\mathrm{E}}_{m+1}, \ldots, x^{\mathrm{E}}_n$ derived using (7) for
the subsequent stages $m + 1$ to $n$. Our proposed MEPP(m)
algorithm generalizes that of [14] which is essentially MEPP(1).



Theorem 1 (Time Complexity). Deriving $x^{\mathrm{E}}_{1:n}$ of
MEPP(m) requires $O(\chi^{m+1}[n + (km)^3])$ time.

The proof of Theorem 1 is given in Appendix A.1. Unlike
MEPP which scales exponentially in the planning horizon
length n, our MEPP(m) algorithm scales linearly in n.
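
The stage-wise recursion (7) and the initial step (8) can be sketched in a few lines of Python. This is a toy illustration, not the authors' code: it assumes a unit-spaced r × n grid with zero-based rows, hypothetical default hyperparameters, and m ≥ 1, and it recomputes determinants instead of exploiting stationarity to precompute the conditional entropies.

```python
import itertools
from functools import lru_cache
import numpy as np

def mepp_m(r, n, k, m, ell=(1.0, 1.0), sig_s2=1.0, sig_n2=0.1):
    """Sketch of MEPP(m); returns (paths, approximate joint entropy as in (6))."""
    actions = list(itertools.combinations(range(r), k))  # joint choices per column

    def locs(cols, acts):
        # (column, row) coordinates of the joint choices made in the given columns
        return [(c, row) for c, a in zip(cols, acts) for row in a]

    def H(P):
        # joint Gaussian entropy (4) of the measurements at locations P
        if not P:
            return 0.0
        P = np.array(P, float)
        d = (P[:, None, :] - P[None, :, :]) / np.array(ell)
        K = sig_s2 * np.exp(-0.5 * (d ** 2).sum(2)) + sig_n2 * np.eye(len(P))
        return 0.5 * (len(P) * np.log(2 * np.pi * np.e) + np.linalg.slogdet(K)[1])

    def H_cond(i, x_i, hist):
        # entropy of column i's measurements given the previous m columns
        past = locs(range(i - len(hist), i), hist)
        return H(past + locs([i], [x_i])) - H(past)

    @lru_cache(maxsize=None)
    def V(i, hist):
        # equations (7): best value and best remaining joint choices from stage i
        best = (-np.inf, ())
        for x_i in actions:
            val, tail = H_cond(i, x_i, hist), ()
            if i < n:
                nxt, tail = V(i + 1, (hist + (x_i,))[-m:])
                val += nxt
            best = max(best, (val, (x_i,) + tail))
        return best

    # equation (8): choose the first m stages jointly
    best = (-np.inf, ())
    for head in itertools.product(actions, repeat=m):
        val, tail = H(locs(range(1, m + 1), head)), ()
        if m < n:
            nxt, tail = V(m + 1, head)
            val += nxt
        best = max(best, (val, head + tail))
    return best[1], best[0]

paths, value = mepp_m(r=3, n=4, k=1, m=3)
```

With m = n − 1 the history covers all earlier columns, so the recursion conditions on everything and the returned value coincides with the exact maximum joint entropy; smaller m trades that exactness for a state space that no longer grows with n.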

Let $\omega_1$ and $\omega_2$ be the horizontal and vertical separation
widths between adjacent grid locations, respectively, $\ell'_1 \triangleq \ell_1/\omega_1$
and $\ell'_2 \triangleq \ell_2/\omega_2$ denote the normalized horizontal
and vertical length-scale components, respectively, and $\eta \triangleq \sigma_n^2/\sigma_s^2$.
The following result bounds the loss in active sensing
performance of the MEPP(m) algorithm (i.e., (7) and
(8)) relative to that of MEPP (5):

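Theorem 2 (Performance Guarantee). The paths
$x^{\mathrm{E}}_{1:n}$ are $\epsilon$-optimal in achieving the maximum entropy criterion,
i.e., $H(Z_{x^*_{1:n}}) - H(Z_{x^{\mathrm{E}}_{1:n}}) \le \epsilon$ where

$$\epsilon \triangleq [k(n - m)]^2 \log\left\{1 + \frac{\exp\{-(m+1)^2/(2\ell_1'^2)\}}{\eta(1 + \eta)}\right\}.$$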



The proof of Theorem 2 is given in Appendix A.3. Theorem 2
reveals that the active sensing performance of MEPP(m)
can be improved by decreasing $\epsilon$, which is achieved using
higher noise-to-signal ratio $\eta$ (i.e., noisy, less intense fields),
smaller number $k$ of robots, shorter planning horizon length
$n$, larger $m$, and/or lower spatial correlation $\ell'_1$ along the
transect. Two important implications result: (a) Increasing
$m$ trades off computational efficiency (Theorem 1) for better
active sensing performance, and (b) if the spatial correlation
of the anisotropic field along the transect is sufficiently low
to maintain a relatively tight bound $\epsilon$ such that only a small
$m$ is needed, then MEPP(m) can exploit this spatial correlation
structure to gain time efficiency while preserving
near-optimal active sensing performance. In practice, it is
often possible to obtain prior knowledge on a direction of
low spatial correlation (refer to ocean and freshwater phenomena
in Section 1 for examples) and align it with the
horizontal axis of the transect.

¹In fact, solving MEPP(m) (7) yields a policy that, in each
stage $i$, induces an optimal vector for every possible vector
$x_{i-m:i-1}$ (including possible diverged paths from $x^{\mathrm{E}}_{i-m:i-1}$
due to external forces) obtained from previous $m$ stages.
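
The qualitative behaviour described after Theorem 2 can be checked numerically. The sketch below assumes the loss bound has the form ε = [k(n − m)]² log{1 + exp{−(m + 1)²/(2ℓ′₁²)} / (η(1 + η))}, which is a reading of a formula that is partly illegible in the source; the function name and the example values are hypothetical.

```python
import math

def mepp_loss_bound(k, n, m, ell1_norm, eta):
    """Loss bound of Theorem 2 under the assumed form stated in the lead-in."""
    decay = math.exp(-(m + 1) ** 2 / (2 * ell1_norm ** 2))
    return (k * (n - m)) ** 2 * math.log(1 + decay / (eta * (1 + eta)))

# Hypothetical setting: 2 robots, 30 columns, normalized length-scale 8.09
base = mepp_loss_bound(k=2, n=30, m=1, ell1_norm=8.09, eta=0.023)
larger_m = mepp_loss_bound(k=2, n=30, m=2, ell1_norm=8.09, eta=0.023)
lower_corr = mepp_loss_bound(k=2, n=30, m=1, ell1_norm=1.0, eta=0.023)
noisier = mepp_loss_bound(k=2, n=30, m=1, ell1_norm=8.09, eta=0.5)
```

Under this assumed form, each of the three variations (larger m, lower spatial correlation along the transect, higher noise-to-signal ratio) yields a smaller bound than `base`, in line with the discussion above.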

4. MUTUAL INFORMATION-BASED PATH 
PLANNING 

Notations. Recall that the team of $k$ robots selects $k$ locations
$x_i$ to be sampled from column $i$ of the transect for
$i = 1, \ldots, n$. Let $u_i$ denote a vector of remaining $r - k$
unobserved locations in column $i$ and $Z_{u_i}$ denote a vector
of the corresponding random measurements. Let $u_{i:j}$ and
$Z_{u_{i:j}}$ denote vectors concatenating remaining unobserved locations
$u_i, \ldots, u_j$ and concatenating corresponding random
measurements $Z_{u_i}, \ldots, Z_{u_j}$ over stages $i$ to $j$, respectively.

Maximum Mutual Information Path Planning (M²IPP).
An alternative to MEPP is to plan non-myopic observation
paths $x^*_{1:n}$ that share the maximum mutual information with
the remaining unobserved locations $u^*_{1:n}$ of the transect:

$$\begin{aligned}
x^*_{1:n} &= \arg\max_{x_{1:n} \in \mathcal{X}_{1:n}} I(Z_{x_{1:n}}; Z_{u_{1:n}}) \\
I(Z_{x_{1:n}}; Z_{u_{1:n}}) &= H(Z_{u_{1:n}}) - H(Z_{u_{1:n}} \mid Z_{x_{1:n}})\ .
\end{aligned} \tag{9}$$

From (9), $I(Z_{x_{1:n}}; Z_{u_{1:n}})$ measures the reduction in entropy/
uncertainty of the measurements $Z_{u_{1:n}}$ at the remaining unobserved
locations $u_{1:n}$ of the transect by observing the measurements
$Z_{x_{1:n}}$ to be sampled along the paths $x_{1:n}$. So, the
path planning of M²IPP (9) is equivalent to the selection
of remaining unobserved locations with the largest entropy
reduction (i.e., determining $u^*_{1:n}$). This may be mistakenly
perceived as the selection of remaining unobserved locations
with the lowest uncertainty (i.e., minimizing posterior entropy
term $H(Z_{u_{1:n}} \mid Z_{x_{1:n}})$ in (9)), which is exactly what
the path planning of MEPP (5) can achieve, as mentioned in
Section 3. Note, however, that the maximum mutual information
paths (9) planned by M²IPP can in fact induce a very
large prior entropy $H(Z_{u_{1:n}})$ but not necessarily the smallest
posterior entropy $H(Z_{u_{1:n}} \mid Z_{x_{1:n}})$. Consequently, MEPP
and M²IPP exhibit different path planning behaviors and
resulting active sensing performances, as shown empirically
in Section 5.
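
The decomposition in (9) can be evaluated for a given set of paths on a toy transect. The sketch below returns the mutual information together with its prior and posterior entropy terms, so the two criteria discussed above can be compared side by side. It assumes a unit-spaced grid with zero-based rows and hypothetical hyperparameters; `entropy` and `mi_terms` are illustrative names.

```python
import numpy as np

def entropy(P, ell=(8.0, 1.0), sig_s2=1.0, sig_n2=0.1):
    """Joint Gaussian entropy (4) of the measurements at locations P."""
    P = np.array(P, float)
    d = (P[:, None, :] - P[None, :, :]) / np.array(ell)
    K = sig_s2 * np.exp(-0.5 * (d ** 2).sum(2)) + sig_n2 * np.eye(len(P))
    return 0.5 * (len(P) * np.log(2 * np.pi * np.e) + np.linalg.slogdet(K)[1])

def mi_terms(path, r):
    """Return (mutual information, prior entropy, posterior entropy) as in (9).

    path[c] holds the rows sampled by the robots in column c + 1.
    """
    X = [(c, row) for c, rows in enumerate(path, 1) for row in rows]
    U = [(c, row) for c, rows in enumerate(path, 1)
         for row in range(r) if row not in rows]
    prior = entropy(U)                        # entropy of the unobserved measurements
    posterior = entropy(U + X) - entropy(X)   # same, conditioned on the sampled ones
    return prior - posterior, prior, posterior

# Hypothetical usage: one robot crossing a 3 x 3 transect
mi, prior, posterior = mi_terms([(0,), (2,), (1,)], r=3)
```

Because the unobserved set depends on the path, both `prior` and `posterior` change from path to path, which is why maximizing their difference is not the same as minimizing `posterior` alone.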

Similar to MEPP, M²IPP incurs exponential time in the
length of planning horizon. To relieve this computational 
burden, we will describe an approximation algorithm for 
planning maximum mutual information paths next. 

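Approximate M²IPP(m). We will exploit the same property
of the covariance function (1) as that used by MEPP(m)
(Section 3) to establish a trade-off between active sensing
performance and computational efficiency for our M²IPP(m)
algorithm. However, this is not as straightforward to achieve
as that to derive MEPP(m) where an $m$-th order Markov
property can simply be imposed on each posterior entropy
term in (6). To illustrate this, using the chain rule for mutual
information [3],

$$\begin{aligned}
I(Z_{x_{1:n}}; Z_{u_{1:n}}) = {}& I(Z_{x_{1:m}}; Z_{u_{1:n}}) + \sum_{i=m+1}^{n-m-1} I(Z_{x_i}; Z_{u_{1:n}} \mid Z_{x_{1:i-1}}) \\
& + I(Z_{x_{n-m:n}}; Z_{u_{1:n}} \mid Z_{x_{1:n-m-1}})\ ,
\end{aligned}$$

after which an $m$-th order Markov property is assumed to
yield the following approximation:

$$\begin{aligned}
I(Z_{x_{1:n}}; Z_{u_{1:n}}) \approx {}& I(Z_{x_{1:m}}; Z_{u_{1:n}}) + \sum_{i=m+1}^{n-m-1} I(Z_{x_i}; Z_{u_{1:n}} \mid Z_{x_{i-m:i-1}}) \\
& + I(Z_{x_{n-m:n}}; Z_{u_{1:n}} \mid Z_{x_{n-2m:n-m-1}})\ .
\end{aligned} \tag{10}$$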

From (10), note that each conditional mutual information
term $I(Z_{x_i}; Z_{u_{1:n}} \mid Z_{x_{i-m:i-1}})$ cannot be evaluated individually
because the remaining unobserved locations $u_{1:n}$ of the
transect (specifically, $u_{1:i-m-1}$ and $u_{i+1:n}$ in the respective
columns 1 to $i - m - 1$ and $i + 1$ to $n$) cannot be determined
simply by knowing the robots' past and current sampling
locations $x_{i-m:i-1}$ and $x_i$ in columns $i - m$ to $i$.

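To resolve this, we exploit the same property of the covariance
function (1) as that used by MEPP(m) (Section 3)
again: It makes the measurements $Z_{x_i}$ to be observed next
in column $i$ near-independent of the distant unobserved measurements
$Z_{u_{1:i-m-1}}$ and $Z_{u_{i+m+1:n}}$ in the respective columns
1 to $i - m - 1$ and $i + m + 1$ to $n$ (i.e., far from column
$i$) for a sufficiently large $m$ by conditioning on the closer
measurements $Z_{x_{i-m:i-1}}$ and $Z_{u_{i-m:i+m}}$ in columns $i - m$
to $i + m$ (i.e., closer to column $i$). As a result, each term
$I(Z_{x_i}; Z_{u_{1:n}} \mid Z_{x_{i-m:i-1}})$ in (10) can be closely approximated
by $I(Z_{x_i}; Z_{u_{i-m:i+m}} \mid Z_{x_{i-m:i-1}})$ for $i = m+1, \ldots, n-m-1$:

$$\begin{aligned}
I(Z_{x_i}; Z_{u_{1:n}} \mid Z_{x_{i-m:i-1}}) &= H(Z_{x_i} \mid Z_{x_{i-m:i-1}}) - H(Z_{x_i} \mid Z_{x_{i-m:i-1}}, Z_{u_{1:n}}) \\
&\approx H(Z_{x_i} \mid Z_{x_{i-m:i-1}}) - H(Z_{x_i} \mid Z_{x_{i-m:i-1}}, Z_{u_{i-m:i+m}}) \\
&= I(Z_{x_i}; Z_{u_{i-m:i+m}} \mid Z_{x_{i-m:i-1}})
\end{aligned}$$

where the approximation follows from the above-mentioned
conditional independence assumption and the equalities are
due to the definition of conditional mutual information [3].
Similarly, $I(Z_{x_{1:m}}; Z_{u_{1:n}})$ and $I(Z_{x_{n-m:n}}; Z_{u_{1:n}} \mid Z_{x_{n-2m:n-m-1}})$
in (10) are, respectively, approximated by $I(Z_{x_{1:m}}; Z_{u_{1:2m}})$
and $I(Z_{x_{n-m:n}}; Z_{u_{n-2m:n}} \mid Z_{x_{n-2m:n-m-1}})$. Then,

$$\begin{aligned}
I(Z_{x_{1:n}}; Z_{u_{1:n}}) \approx {}& I(Z_{x_{1:m}}; Z_{u_{1:2m}}) + \sum_{i=m+1}^{n-m-1} I(Z_{x_i}; Z_{u_{i-m:i+m}} \mid Z_{x_{i-m:i-1}}) \\
& + I(Z_{x_{n-m:n}}; Z_{u_{n-2m:n}} \mid Z_{x_{n-2m:n-m-1}}) \\
= {}& I(Z_{x_{1:m}}; Z_{u_{1:2m}}) + \sum_{i=2m+1}^{n-1} I(Z_{x_{i-m}}; Z_{u_{i-2m:i}} \mid Z_{x_{i-2m:i-m-1}}) \\
& + I(Z_{x_{n-m:n}}; Z_{u_{n-2m:n}} \mid Z_{x_{n-2m:n-m-1}})\ .
\end{aligned} \tag{11}$$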

Using (11), M²IPP (9) can be approximated by the following
stage-wise dynamic programming equations, which we call
M²IPP(m):

$$\begin{aligned}
U_i(x_{i-2m:i-1}) &= \max_{x_i \in \mathcal{X}_i} I(Z_{x_{i-m}}; Z_{u_{i-2m:i}} \mid Z_{x_{i-2m:i-m-1}}) + U_{i+1}(x_{i-2m+1:i}) \\
U_n(x_{n-2m:n-1}) &= \max_{x_n \in \mathcal{X}_n} I(Z_{x_{n-m:n}}; Z_{u_{n-2m:n}} \mid Z_{x_{n-2m:n-m-1}})
\end{aligned} \tag{12}$$

for stage $i = 2m+1, \ldots, n-1$, each of which induces a corresponding
optimal vector $x^{\mathrm{M}}_i$ of $k$ locations given the optimal
vector $x^{\mathrm{M}}_{i-2m:i-1}$ obtained from previous stages $i - 2m$ to
$i - 1$.² Note that the term $I(Z_{x_{i-m}}; Z_{u_{i-2m:i}} \mid Z_{x_{i-2m:i-m-1}})$
in each stage $i$ can be evaluated now because the remaining
unobserved locations $u_{i-2m:i}$ in columns $i - 2m$ to $i$ can be
determined since the robots' past and current sampling locations
$x_{i-2m:i-1}$ and $x_i$ in the same columns are given (i.e.,
as input to $U_i$ and under the max operator, respectively).
Let the optimal observation paths of M²IPP(m) be denoted
by $x^{\mathrm{M}}_{1:n}$ that concatenates

$$x^{\mathrm{M}}_{1:2m} = \arg\max_{x_{1:2m} \in \mathcal{X}_{1:2m}} I(Z_{x_{1:m}}; Z_{u_{1:2m}}) + U_{2m+1}(x_{1:2m}) \tag{13}$$

for the first $2m$ stages and $x^{\mathrm{M}}_{2m+1}, \ldots, x^{\mathrm{M}}_n$ derived using (12)
for the subsequent stages $2m + 1$ to $n$.

Theorem 3 (Time Complexity). Deriving $x^{\mathrm{M}}_{1:n}$ of
M²IPP(m) requires $O(\chi^{2m+1}[n + 2(r(2m+1))^3])$ time.



The proof of Theorem 3 is given in Appendix B.1. Unlike
M²IPP that scales exponentially in the planning horizon
length n, our M²IPP(m) algorithm scales linearly in n.
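
To get a feel for these growth rates, the leading terms of the stated complexities can be evaluated for a given grid. The sketch below assumes the bounds read O(χⁿ(kn)³) for MEPP, O(χ^(m+1)[n + (km)³]) for MEPP(m), and O(χ^(2m+1)[n + 2(r(2m+1))³]) for M²IPP(m); these are readings of expressions that are partly illegible in the source, hidden constants are ignored, and `op_counts` is a hypothetical name.

```python
from math import comb

def op_counts(r, n, k, m):
    """Leading-term operation counts under the assumed complexity expressions."""
    chi = comb(r, k)  # number of joint sampling choices per column
    return {
        "MEPP": chi ** n * (k * n) ** 3,
        "MEPP(m)": chi ** (m + 1) * (n + (k * m) ** 3),
        "M2IPP(m)": chi ** (2 * m + 1) * (n + 2 * (r * (2 * m + 1)) ** 3),
    }

# 5 x 30 grid with 2 robots and m = 1
counts = op_counts(r=5, n=30, k=2, m=1)
```

For this setting the approximate algorithms stay in the thousands to millions of operations, whereas the exact formulation grows as the thirtieth power of the number of joint choices per column.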

The following result bounds the loss in active sensing performance
of the M²IPP(m) algorithm (i.e., (12) and (13))
relative to that of M²IPP (9):

Theorem 4 (Performance Guarantee). The paths
$x^{\mathrm{M}}_{1:n}$ are $\epsilon$-optimal in achieving the maximum mutual information
criterion, i.e., $I(Z_{x^*_{1:n}}; Z_{u^*_{1:n}}) - I(Z_{x^{\mathrm{M}}_{1:n}}; Z_{u^{\mathrm{M}}_{1:n}}) \le \epsilon$
where

r (m+l)" Y 

2m) '-">^ ' I- J 



k{n—2m) 



rn + 2^^^ 



ioga+- 



rj{l + fj) 



The proof of Theorem 4 is given in Appendix B.3. As shown
in Theorem 4, decreasing $\epsilon$ improves the active sensing performance
of M²IPP(m); this can be achieved in a similar
manner to that for decreasing the loss bound of MEPP(m)
(see paragraph after Theorem 2) since the two loss bounds
are similar. In addition, smaller number $r$ of sampling
locations in each column decreases $\epsilon$. M²IPP(m) shares
the same implications as that of MEPP(m): (a) Increasing
$m$ trades off time efficiency (Theorem 3) for improved
active sensing performance, and (b) M²IPP(m) can exploit
a low spatial correlation $\ell'_1$ of the anisotropic field along the
transect to improve time efficiency (i.e., only requiring a
small $m$) while preserving near-optimal active sensing performance
(i.e., still maintaining a relatively tight bound $\epsilon$).

5. EXPERIMENTS AND DISCUSSION 

This section evaluates the active sensing performance and
computational efficiency of the MEPP(m) (i.e., (7) and (8))
and M²IPP(m) (i.e., (12) and (13)) algorithms empirically
on two real-world datasets: (a) May 2009 temperature field
data of Panther Hollow Lake in Pittsburgh, PA spatially distributed
over a 25 m by 150 m transect that is discretized
into a 5 × 30 grid [14], and (b) June 2009 plankton density
field data of Chesapeake Bay spatially distributed over a
314 m by 1765 m transect that is discretized into an 8 × 45 grid
[5]. These environmental phenomena are modeled by GPs
with hyperparameters (i.e., horizontal and vertical length-scales,
signal and noise variances) (Section 2.2) learned using
maximum likelihood estimation (MLE) [20]: (a) $\ell_1$ = … for
the temperature field, and (b) $\ell_1$ = 27.53 m, $\ell_2$ = 134.64 m,
$\sigma_s^2$ = 2.152, and $\sigma_n^2$ = 0.041 for the plankton density field. It
can be observed that the temperature and plankton density
fields have low noise-to-signal ratios $\eta$ of 0.023 and 0.019, respectively.
Also, though both fields are observed to be highly
anisotropic, the spatial correlation of the temperature field
is much higher along the transect than perpendicular to it.
According to Theorems 2 and 4, such field conditions lead
to loose performance loss bounds for both algorithms, which
does not necessarily imply their poor performance. So, the
empirical evaluation here complements our theoretical results
by assessing their performance-efficiency trade-off (i.e.,
by varying $m$) under these less favorable field conditions. To
further investigate our algorithms' trade-off behaviors under
different horizontal and vertical spatial correlations, the corresponding
length-scales $\ell_1$ and $\ell_2$ of the original temperature
field (Fig. 2d) are reduced and fixed to produce three
other modified fields (Figs. 2a, 2b, 2c) with the signal and
noise variances $\sigma_s^2$ and $\sigma_n^2$ learned using MLE.

Figure 2: Temperature fields (measured in °C) discretized
into 5 × 30 grids with varying horizontal and
vertical length-scales: (a) $\ell_1$ = 5 m, $\ell_2$ = 5 m;
(b) $\ell_1$ = 5 m, $\ell_2$ = 16 m; (c) $\ell_1$ = 40.45 m, $\ell_2$ = 5 m;
(d) $\ell_1$ = 40.45 m, $\ell_2$ = 16 m.

²Similar to MEPP(m), solving M²IPP(m) (12) yields a policy
that, in each stage $i$, induces an optimal vector for every
possible vector $x_{i-2m:i-1}$ (including possible diverged paths
from $x^{\mathrm{M}}_{i-2m:i-1}$) obtained from previous $2m$ stages.

Comparison with Active Sensing Algorithms. The
performance of our proposed algorithms is compared to that
of state-of-the-art information-theoretic path planning algorithms
for active sensing: The work of [13] has proposed the
following greedy maximum entropy path planning (gMEPP)
algorithm:

$$V_i^{\widehat{\mathrm{E}}}(x_{1:i-1}) = \max_{x_i \in \mathcal{X}_i} H(Z_{x_i} \mid Z_{x_{1:i-1}}) \tag{14}$$

for stage $i = 1, \ldots, n$, each of which induces a corresponding
optimal vector $x^{\widehat{\mathrm{E}}}_i$ of $k$ locations given the optimal vector
$x^{\widehat{\mathrm{E}}}_{1:i-1}$ obtained from previous stages 1 to $i - 1$. A greedy
maximum mutual information path planning (gM²IPP) algorithm
is devised by as follows:

$$U_i^{\widehat{\mathrm{M}}}(x_{1:i-1}) = \max_{x_i \in \mathcal{X}_i} I(Z_{x_{1:i}}; Z_{\bar{x}_{1:i}}) \tag{15}$$

for stage $i = 1, \ldots, n$, each of which induces a corresponding
optimal vector $x^{\widehat{\mathrm{M}}}_i$ of $k$ locations given the optimal vector
$x^{\widehat{\mathrm{M}}}_{1:i-1}$ obtained from previous stages 1 to $i - 1$, and $\bar{x}_{1:i}$
denotes a vector of all sampling locations in the domain $\mathcal{D}$
excluding those of $x_{1:i}$. As mentioned earlier in Section 3,
the work of [14] has developed MEPP(1), which is a special
case of our MEPP(m) algorithm.

In contrast to our MEPP(m) and M²IPP(m) algorithms
that scale linearly in the length $n$ of planning horizon (Theorems
1 and 3), deriving $x^{\widehat{\mathrm{E}}}_{1:n}$ of gMEPP and $x^{\widehat{\mathrm{M}}}_{1:n}$ of gM²IPP
incurs quartic time in $n$. Hence, if the required value of $m$
is sufficiently small, then MEPP(m) and M²IPP(m) can be
more efficient than the greedy algorithms, as shown below.
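
The greedy rule (14) is straightforward to sketch: at each column, pick the joint choice with the largest entropy conditioned on everything sampled so far. The following toy version assumes a unit-spaced grid with zero-based rows and hypothetical hyperparameters; it is an illustration rather than the implementation evaluated in the paper.

```python
import itertools
import numpy as np

def gmepp(r, n, k, ell=(1.0, 1.0), sig_s2=1.0, sig_n2=0.1):
    """Greedy maximum entropy paths as in (14); returns (paths, joint entropy)."""
    actions = list(itertools.combinations(range(r), k))

    def H(P):
        # joint Gaussian entropy (4) of the measurements at locations P
        if not P:
            return 0.0
        P = np.array(P, float)
        d = (P[:, None, :] - P[None, :, :]) / np.array(ell)
        K = sig_s2 * np.exp(-0.5 * (d ** 2).sum(2)) + sig_n2 * np.eye(len(P))
        return 0.5 * (len(P) * np.log(2 * np.pi * np.e) + np.linalg.slogdet(K)[1])

    past, path = [], []
    for i in range(1, n + 1):
        def cond_entropy(a):
            # entropy of column i's measurements given all earlier samples
            return H(past + [(i, row) for row in a]) - H(past)
        best = max(actions, key=cond_entropy)
        path.append(best)
        past += [(i, row) for row in best]
    return path, H(past)

path, h = gmepp(r=3, n=3, k=1)
```

Since every stage conditions on the full history, the matrices whose determinants are evaluated grow with the stage index, which is where the higher-order dependence on n comes from; the greedy result can also fall short of the exhaustive optimum.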

Performance Metrics. The tested algorithms are evaluated
using three different metrics: The (a) entropy metric
EN($x_{1:n}$) $\triangleq H(Z_{u_{1:n}} \mid Z_{x_{1:n}})$ and (b) mutual information
metric MI($x_{1:n}$) $\triangleq I(Z_{x_{1:n}}; Z_{u_{1:n}})$ measure, respectively, the



Table 1: Comparison of EN($x_{1:n}$), MI($x_{1:n}$), and
ER($x_{1:n}$) ($\times 10^{-\ldots}$) performance for different temperature fields
shown in Fig. 2 with varying number of robots.
For our proposed M²IPP(m) and MEPP(m) algorithms, every
performance result is preceded by the value of
m (in round brackets) used.
gMEPP: -64.8, -128.4, -173.3, -182.4, 26.5, …, 46.0, 39.5, …
gMEPP: …, -132.9, …, 45.8, …, 0.018
M²IPP(m): (1) -57.8, …, (1) -132.9, …, (2) 41.8, …, (1) 45.9, (1) 36.9, (1) 0.605, (1) 0.265, (1) 0.020, …
MEPP(m): …

gM²IPP: -46.5, -80.5, -89.5, -92.8, …
gMEPP: -46.3, -80.6, -89.5, -93.2, 40.5, 60.6, 41.3, 28.6, 0.257, 0.024, 0.017, 0.009
M²IPP(m): (1) -46.5, (1) -72.0, (1) -89.4, (1) -92.1, (1) 40.8, (1) 60.0, …, (1) 32.0, …, (1) 0.123, (1) 0.016, (1) 0.008
          (2) -89.5, (2) 41.3, (2) 0.229, (2) 0.014
MEPP(m): (1) -45.9 / (2) -46.5, (1) -81.3, (1) -89.4, (1) -93.5, (1) 40.2 / (2) 40.8, (1) 61.6, (1) 38.7 / (4) 41.1, (1) 28.2 / (3) 28.6 / (4) 29.0, (1) 0.231, (1) 0.014, (1) 0.013, (1) 0.007



posterior entropy/uncertainty and the reduction in entropy/
uncertainty at the remaining unobserved locations of
the transect given the observation paths $x_{1:n}$. The difference
between the entropy and mutual information metrics
has been explained in the paragraph after (9) in Section 4.
The (c)
$\mathrm{ER}(x_{1:n}) \triangleq \|z_{u_{1:n}} - \mu_{u_{1:n}|x_{1:n}}\|_2^2 / \{\bar{\mu}^2 n(r-k)\}$
metric measures the mean-squared relative prediction error
resulting from using the posterior mean $\mu_{u_{1:n}|x_{1:n}}$ (2) to
predict the measurements at the remaining $n(r-k)$ unobserved
locations $u_{1:n}$ of the transect given the measurements
sampled along the observation paths $x_{1:n}$, where
$\bar{\mu} = \mathbf{1}^{\top} z_{u_{1:n}} / \{n(r-k)\}$.
It has an advantage over the two information-theoretic
metrics of using ground truth measurements to
evaluate if the phenomenon is being predicted accurately.
However, unlike the $\mathrm{EN}(x_{1:n})$ and $\mathrm{MI}(x_{1:n})$ metrics that
account for the spatial correlation between measurements at
the unobserved locations $u_{1:n}$, the $\mathrm{ER}(x_{1:n})$ metric assumes
conditional independence between them. In contrast to the
$\mathrm{ER}(x_{1:n})$ metric, the $\mathrm{EN}(x_{1:n})$ and $\mathrm{MI}(x_{1:n})$ metrics
consequently do not overestimate their uncertainty.
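
As a rough illustration of the prediction error metric described above, the following Python sketch computes the sum of squared prediction errors divided by the squared mean of the ground-truth measurements and by the number of unobserved locations. The function name `er_metric` and the sample values are hypothetical and are not taken from the experiments; the formula follows the definition as read from the text.

```python
def er_metric(z_true, mu_pred):
    """Mean-squared relative prediction error over the unobserved locations.

    z_true  : ground-truth measurements at the unobserved locations
    mu_pred : posterior-mean predictions at the same locations
    """
    if len(z_true) != len(mu_pred) or not z_true:
        raise ValueError("inputs must be non-empty and of equal length")
    count = len(z_true)                      # plays the role of n(r - k)
    mean_truth = sum(z_true) / count         # mean of the true measurements
    squared_error = sum((z - mu) ** 2 for z, mu in zip(z_true, mu_pred))
    return squared_error / (mean_truth ** 2 * count)


# Hypothetical values, chosen only to exercise the function.
truth = [20.0, 21.0, 19.0, 20.0]
prediction = [20.5, 20.5, 19.5, 20.0]
print(er_metric(truth, prediction))
```

Because the error is normalized by the squared mean, the metric is unitless, which is what allows it to be compared across fields measured on different scales.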

5.1 Temperature Field Data 

Table 1 shows the results of $\mathrm{EN}(x_{1:n})$, $\mathrm{MI}(x_{1:n})$, and $\mathrm{ER}(x_{1:n})$
performance of tested algorithms for temperature fields with
different horizontal and vertical length-scales (Fig. 2) and
with varying number of robots. For our proposed M²IPP(m)
and MEPP(m) algorithms, the results are reported in an
increasing order of m until the performance has stabilized. It
can be observed from Table 1 that MEPP(m) with m > 1
or M²IPP(m) often outperforms MEPP(1) in the three
metrics, as discussed and explained later. Note that every
increment of m increases the length of history of sampling
locations considered in each stage by two for M²IPP(m)
instead of by one for MEPP(m); this can be seen from the
inputs to $U_i$ (12) and $V_i$ (7), respectively. The observations
of the results are detailed in the rest of this subsection.

5.1.1 Entropy Metric $\mathrm{EN}(x_{1:n})$

As expected, the entropy-based MEPP(m) and gMEPP
algorithms generally perform better than or at least as well
as the mutual information-based M²IPP(m) and gM²IPP
algorithms in this metric.

(a) 1 robot. (b) 2 robots. (c) 3 robots.

Figure 3: Graphs of incurred time by different active
sensing algorithms vs. m for temperature fields with
varying number of robots.

For fields a, b, and d (i.e., of small $\ell_1$ or large $\ell_2$) with any
number of robots, MEPP(m) can produce $\mathrm{EN}(x^{E}_{1:n})$ values
lower than or comparable to those achieved by gMEPP and
gM²IPP using small values of m (i.e., m = 1 or 2), hence
incurring 1 to 4 orders of magnitude less computational time,
as shown in Fig. 3. This can be explained by one of the
following reasons: (a) A low spatial correlation along the
transect cannot be exploited by gMEPP and gM²IPP, which
consider the entire history of past measurements for improving
active sensing performance; (b) a high correlation
perpendicular to the transect can be exploited by MEPP(m) for
better active sensing performance; and (c) unlike the greedy
gMEPP and gM²IPP algorithms, MEPP(m) is capable of
non-myopic planning to improve active sensing performance.

For field c (i.e., of large $\ell_1$ and small $\ell_2$) with 1 robot,
MEPP(m) cannot exploit the low spatial correlation
perpendicular to the transect for improving active sensing
performance. Therefore, it needs to raise the value of m up to 4 in
order to better exploit the high spatial correlation along the
transect. Consequently, MEPP(m) can achieve $\mathrm{EN}(x^{E}_{1:n})$
performance comparable to that achieved by gMEPP and
gM²IPP while incurring similar computational time to gMEPP
and about 2 orders of magnitude less time than gM²IPP.
Increasing the number of robots allows MEPP(m) to achieve
$\mathrm{EN}(x^{E}_{1:n})$ performance comparable to that of gMEPP and
gM²IPP using smaller values of m (i.e., m = 1 or 2), hence
incurring 1 to 4 orders of magnitude less time.




Table 2: Comparison of $\mathrm{EN}(x_{1:n})$, $\mathrm{MI}(x_{1:n})$, and
$\mathrm{ER}(x_{1:n})$ ($\times 10^{-?}$) performance for plankton density field

Figure 4: Plankton density (chl-a) field (measured
in mg m$^{-3}$) discretized into an 8 x 45 grid.

5.1.2 Mutual Information Metric $\mathrm{MI}(x_{1:n})$

The mutual information-based M²IPP(m) and gM²IPP
algorithms often perform better than or at least as well as
the entropy-based MEPP(m) and gMEPP in this metric.

For fields a, b, and d (i.e., of small $\ell_1$ or large $\ell_2$) with any
number of robots, M²IPP(m) can generally yield $\mathrm{MI}(x^{M}_{1:n})$
values higher than or comparable to those achieved by gM²IPP
and gMEPP using a small m value of 1, hence incurring less
computational time (in particular, about 2 orders of magnitude
less time than gM²IPP), as shown in Fig. 3. This can
be explained by the same reasons as those discussed
previously in Section 5.1.1.

For field c (i.e., of large $\ell_1$ and small $\ell_2$) with 1 or 3 robots,
M²IPP(m) cannot exploit the low spatial correlation
perpendicular to the transect for improving active sensing
performance. So, it has to increase the value of m to 2 in
order to better exploit the high correlation along the transect.
As a result, M²IPP(m) can achieve $\mathrm{MI}(x^{M}_{1:n})$ performance
comparable to that achieved by gM²IPP and gMEPP while
incurring less time with 1 robot and slightly more time with
3 robots than gM²IPP. With 2 robots, m = 1 suffices for
M²IPP(m) to achieve $\mathrm{MI}(x^{M}_{1:n})$ performance comparable to
that achieved by gM²IPP and gMEPP while incurring less
time (Fig. 3). A computationally cheaper alternative for
active sensing of field c is to consider using MEPP(m) with
larger m: When the values of m are raised to 4, 2, and
4 for the respective 1-, 2-, and 3-robot cases, it can
produce $\mathrm{MI}(x^{E}_{1:n})$ performance comparable to that achieved by
gM²IPP and gMEPP while incurring similar or less time.

5.1.3 Prediction Error Metric $\mathrm{ER}(x_{1:n})$

For field c (i.e., of large $\ell_1$ and small $\ell_2$) with any number
of robots, MEPP(m) and M²IPP(m) cannot exploit the
low spatial correlation perpendicular to the transect for
improving active sensing performance. Hence, their values
of m need to be raised in order to exploit the high
correlation along the transect. Compared to M²IPP(m), it is
computationally cheaper (Fig. 3) and offers greater performance
improvement (Table 1) to increase the value of m of
MEPP(m), which can then produce $\mathrm{ER}(x^{E}_{1:n})$ values lower
than those achieved by gMEPP and gM²IPP while incurring
similar computational time to gMEPP and about 2 orders
of magnitude less time than gM²IPP with 1 robot and 1 to 4
orders of magnitude less time than both with 2 or 3 robots.

For field d (i.e., of large $\ell_1$ and large $\ell_2$) with any number
of robots, MEPP(m) can now exploit the high spatial
correlation perpendicular to the transect for better active
sensing performance. As a result, MEPP(m) can yield better
$\mathrm{ER}(x^{E}_{1:n})$ performance than gMEPP and gM²IPP using
smaller values of m (i.e., m = 1 or 2), hence incurring 1 to
4 orders of magnitude less time.

For fields a and b (i.e., of small $\ell_1$) with 1 or 2 robots,
M²IPP(m) can produce $\mathrm{ER}(x^{M}_{1:n})$ values lower than or
comparable to those achieved by gM²IPP and gMEPP using a
small m value of 1, hence incurring less time (in particular,
about 2 orders of magnitude less time than gM²IPP),
as shown in Fig. 3. Increasing to 3 robots allows MEPP(m)
to achieve $\mathrm{ER}(x^{E}_{1:n})$ performance better than or comparable
to that of gMEPP and gM²IPP using a small m value
of 1, hence incurring 3 to 4 orders of magnitude less time
(Fig. 3). These can be explained by the same reasons as
those discussed previously in Section 5.1.1.

5.2 Plankton Density Field Data 

Table 2 shows the results of $\mathrm{EN}(x_{1:n})$, $\mathrm{MI}(x_{1:n})$, and $\mathrm{ER}(x_{1:n})$
performance of tested algorithms for the plankton density
field (Fig. 4) with varying number of robots. For our proposed
M²IPP(m) and MEPP(m) algorithms, the results are
only reported for m = 1, at which their performance has
already stabilized. As mentioned earlier in the first paragraph
of Section 5, the plankton density field exhibits low and high
spatial correlations, respectively, along and perpendicular to
the transect, which resemble those of temperature field b.

The observations are as follows: With any number of
robots, MEPP(1) can produce $\mathrm{EN}(x^{E}_{1:n})$ values lower than
those achieved by gMEPP and gM²IPP while incurring 2 to
5 orders of magnitude less time, as shown in Fig. 5. On
the other hand, M²IPP(1) can yield $\mathrm{MI}(x^{M}_{1:n})$ and $\mathrm{ER}(x^{M}_{1:n})$
performance better than or comparable to that achieved by
gM²IPP and gMEPP while incurring less time (in particular,
about 2 orders of magnitude less time than gM²IPP)
(Fig. 5). These can be explained by the same reasons as
those discussed previously in Section 5.1.1.

5.3 Summary of Test Results 

The observations of the above results are summarized below:
For anisotropic fields with low spatial correlation along
the transect (e.g., temperature fields a and b and plankton
density field), MEPP(m) can perform better or at least as
well as gMEPP and gM²IPP in the prediction error (i.e.,
with 3 robots) and entropy metrics using small m values of
1 or 2, hence incurring 1 to 4 orders of magnitude less time.
M²IPP(m) can generally perform likewise in the prediction
error (i.e., with 1 or 2 robots) and mutual information
metrics using a small m value of 1, hence incurring less time as
well (in particular, 2 orders of magnitude less time than
gM²IPP). These observations are previously explained in
Section 5.1.1. Note that they corroborate the second
implications of Theorems 2 and 4 on the performance guarantees
of MEPP(m) and M²IPP(m).

For anisotropic fields with high spatial correlation along
the transect (e.g., temperature fields c and d), a larger m
value is needed in order for MEPP(m) and M²IPP(m) to exploit
it if the correlation perpendicular to the transect is low
(i.e., field c). Compared to M²IPP(m), it is computationally
cheaper to increase the value of m of MEPP(m) such that it
performs better or at least as well as gMEPP and gM²IPP in
all three metrics while incurring similar time to gMEPP and
about 2 orders of magnitude less time than gM²IPP with 1
robot and often 1 to 4 orders of magnitude less time than
both with 2 or 3 robots. If the correlation perpendicular to
the transect is high (i.e., field d) instead, it can be exploited
by MEPP(m) and M²IPP(m) to improve active sensing
performance and consequently allow m to be reduced to small
values of 1 or 2: MEPP(m) can perform better or, if not,
at least as well as gMEPP and gM²IPP in the prediction
error and entropy metrics while incurring 1 to 4 orders of
magnitude less time. M²IPP(m) can perform likewise in
the mutual information metric while incurring less time (in
particular, 2 orders of magnitude less time than gM²IPP).

(a) 1 robot. (b) 2 robots. (c) 3 robots.

Figure 5: Graphs of incurred time by different active
sensing algorithms vs. m for plankton density field
with varying number of robots.

6. CONCLUSION 

This paper describes two principled information-theoretic
path planning algorithms based on entropy and mutual
information criteria (respectively, MEPP(m) and M²IPP(m))
for active sensing of GP-based anisotropic fields. Two
important practical implications result from the theoretical
guarantees on the active sensing performance of our
algorithms (Theorems 2 and 4): Increasing m trades off
computational efficiency (Theorems 1 and 3) for better active
sensing performance, and our algorithms can exploit a low
spatial correlation along the transect to improve time
efficiency (i.e., only needing a small m) while preserving
near-optimal active sensing performance. This motivates the use
of prior knowledge, if available, on a direction of low spatial
correlation in order to align it with the horizontal axis of
the transect. Empirical evaluation on real-world anisotropic
temperature and plankton density field data reveals that
our algorithms can perform better or at least as well as
gMEPP and gM²IPP while often incurring a few orders
of magnitude less time. In particular, it can be observed
that anisotropic fields with low spatial correlation along the
transect or high correlation perpendicular to the transect
allow our algorithms to perform well using small values of
m, thus yielding significant computational gain over gMEPP
and gM²IPP. To perform well in a field with high correlation
along the transect and low correlation perpendicular to
the transect (i.e., less favorable conditions), our algorithms
have to increase the value of m or the number of robots but
can still achieve comparable or better time efficiency than
gMEPP and gM²IPP.
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APPENDIX 

Notations. Let $\sigma_x^2 \triangleq \sigma_{xx}$ and
$\sigma_{x|s}^2 \triangleq \sigma_{xx|s}$ in (3) for any
location $x$. Let $\xi \triangleq \exp\{-(m+1)^2/(2\ell_1'^2)\}$.
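
To give a feel for how the quantity $\xi$ behaves, the sketch below evaluates $\exp\{-(m+1)^2/(2\ell_1'^2)\}$ for a few values of m. It assumes that $\ell_1'$ is a length-scale along the transect expressed in grid units; the function name `xi` and the two length-scale values are hypothetical and chosen only for illustration.

```python
import math


def xi(m, normalized_length_scale):
    """Evaluate exp{-(m + 1)^2 / (2 * l^2)} for history length m."""
    return math.exp(-((m + 1) ** 2) / (2.0 * normalized_length_scale ** 2))


# Hypothetical length-scales: a short one (low correlation along the
# transect) and a long one (high correlation along the transect).
for length_scale in (1.0, 5.0):
    values = [xi(m, length_scale) for m in range(1, 5)]
    print(length_scale, ["%.4f" % v for v in values])
```

With a short length-scale the value is already small at m = 1, whereas a long length-scale needs a larger m before it decays. This mirrors the observation that fields with low correlation along the transect perform well with small m.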

A. ENTROPY-BASED PATH PLANNING 
A.1 Proof of Theorem 1

Given each vector $x_{i-m:i-1}$, the time needed to evaluate
the posterior entropy $H(Z_{x_i} \mid Z_{x_{i-m:i-1}})$ over all possible
$x_i \in \mathcal{X}_i$ is $\chi \times O((km)^3) = O(\chi (km)^3)$. The time needed to
perform this over all $\chi^m$ possible vectors $x_{i-m:i-1}$ in each
stage $i$ is $\chi^m \times O(\chi (km)^3) = O(\chi^{m+1} (km)^3)$. Since the
covariance function is stationary (i.e., it only depends on
the distance between locations), the entropies calculated in
a stage are the same as those in every other stage. The time
needed to propagate the optimal values from stages $n-1$ to
$m+1$ is $O(\chi^{m+1}(n-m-1))$. To obtain the optimal vector
$x^{E}_{1:m}$, the joint entropy $H(Z_{x_{1:m}})$ has to be evaluated over
all possible vectors $x_{1:m}$. Hence, the time needed to solve
for the optimal vector $x^{E}_{1:m}$ is $O(\chi^{m}(km)^3)$. As a result, the
time complexity of the MEPP(m) algorithm is
$O(\chi^{m+1}[(n-m-1) + (km)^3] + \chi^{m}(km)^3) = O(\chi^{m+1}[n + (km)^3])$.
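
The growth of this bound with m can be tabulated with a few lines of Python. The sketch below evaluates only the dominant term $\chi^{m+1}[n + (km)^3]$ and ignores constant factors. It assumes that $\chi$ is the number of ways to place k robots in a column of r cells, i.e. a binomial coefficient; that reading, the function name `mepp_cost`, and the grid sizes are assumptions made for illustration.

```python
from math import comb


def mepp_cost(n, r, k, m):
    """Dominant term chi^(m+1) * (n + (k*m)^3) of the MEPP(m) bound.

    Assumes chi = C(r, k): the number of ways to place k robots
    in one column of r cells.
    """
    chi = comb(r, k)
    return chi ** (m + 1) * (n + (k * m) ** 3)


# Hypothetical setting: n = 45 columns, r = 8 rows, k = 1 robot.
for m in (1, 2, 3):
    print(m, mepp_cost(45, 8, 1, m))
```

The term is linear in the number of stages n but exponential in m, which is why keeping m small matters far more for running time than shortening the transect.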

A.2 Proof of Some Lemmas 

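Before giving the proof of Theorem 2, the following lemmas
are needed.

Lemma 5. For any observation paths $x_{1:n}$,

$$H(Z_{x^{E}_{1:m}}) + \sum_{i=m+1}^{n} H(Z_{x^{E}_{i}} \mid Z_{x^{E}_{i-m:i-1}}) \;\ge\; H(Z_{x_{1:m}}) + \sum_{i=m+1}^{n} H(Z_{x_{i}} \mid Z_{x_{i-m:i-1}}) .$$

Proof. Using (7),

$$\begin{aligned}
V_{m+1}(x_{1:m})
&= \max_{x_{m+1}} H(Z_{x_{m+1}} \mid Z_{x_{1:m}}) + V_{m+2}(x_{2:m+1}) \\
&= \max_{x_{m+1}} H(Z_{x_{m+1}} \mid Z_{x_{1:m}}) + \max_{x_{m+2}} H(Z_{x_{m+2}} \mid Z_{x_{2:m+1}}) + V_{m+3}(x_{3:m+2}) \\
&= \max_{x_{m+1},\, x_{m+2}} H(Z_{x_{m+1}} \mid Z_{x_{1:m}}) + H(Z_{x_{m+2}} \mid Z_{x_{2:m+1}}) + V_{m+3}(x_{3:m+2}) \\
&= \ldots \\
&= \max_{x_{m+1}, \ldots, x_{n}} \sum_{i=m+1}^{n} H(Z_{x_{i}} \mid Z_{x_{i-m:i-1}}) .
\end{aligned} \tag{16}$$

Given $x_{1:m}$, the vectors $x_{m+1}, \ldots, x_n$ that maximize the
term $\sum_{i=m+1}^{n} H(Z_{x_i} \mid Z_{x_{i-m:i-1}})$ in (16) can be obtained.
Using (8), the observation paths that maximize $H(Z_{x_{1:m}})
+ \sum_{i=m+1}^{n} H(Z_{x_i} \mid Z_{x_{i-m:i-1}})$ can be obtained. Therefore,
Lemma 5 holds. □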
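
The argument in the proof of Lemma 5 is that backward induction over stages recovers the maximum of a sum of stage-wise terms. A toy Python sketch for the case m = 1 can make this concrete. The table `REWARDS` is a hypothetical stand-in for the conditional entropies $H(Z_{x_i} \mid Z_{x_{i-1}})$, which are the same in every stage under a stationary covariance function; none of the names or values come from the paper.

```python
from itertools import product

# Hypothetical stage reward: REWARDS[prev][cur] stands in for the
# conditional entropy of choosing `cur` after `prev` (same for all stages).
REWARDS = [
    [0.2, 1.0, 0.5],
    [0.9, 0.1, 0.7],
    [0.4, 0.8, 0.3],
]


def dp_value(rewards, n_stages, first):
    """Backward induction with a one-stage history (m = 1)."""
    states = range(len(rewards))
    value = [0.0] * len(rewards)          # value beyond the last stage is 0
    for _ in range(n_stages):             # sweep from the last stage backwards
        value = [max(rewards[prev][cur] + value[cur] for cur in states)
                 for prev in states]
    return value[first]


def brute_force(rewards, n_stages, first):
    """Maximize the summed rewards by enumerating every path."""
    states = range(len(rewards))
    best = float("-inf")
    for path in product(states, repeat=n_stages):
        total, prev = 0.0, first
        for cur in path:
            total += rewards[prev][cur]
            prev = cur
        best = max(best, total)
    return best


print(dp_value(REWARDS, 4, 0), brute_force(REWARDS, 4, 0))
```

Both functions return the same value, but the dynamic program only touches each (previous, current) pair once per stage instead of enumerating all paths.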

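Lemma 6. In a GP, given an unobserved location $y$ and
a vector $A$ of sampled locations, $\sigma_{y|A}^2 \ge \sigma_n^2$.

Proof. We know that $\sigma_{y|A}^2 \ge 0$. So, if $\sigma_n^2 > 0$,

$$\sigma_{y|A}^2 = \sigma_s^2 + \sigma_n^2 - \Sigma_{yA}\Sigma_{AA}^{-1}\Sigma_{Ay} \ge 0 \tag{17}$$

where the covariance components in the diagonal of $\Sigma_{AA}$ are
$\sigma_s^2 + \sigma_n^2$. On the other hand, if $\sigma_n^2 = 0$,

$$\sigma_s^2 - \Sigma_{yA}\Sigma_{BB}^{-1}\Sigma_{Ay} \ge 0 \tag{18}$$

where $\Sigma_{BB} \triangleq \Sigma_{AA} - \sigma_n^2 I$.
Let $\mathbf{A} \triangleq \Sigma_{AA}$, $\mathbf{B} \triangleq \Sigma_{BB}$, $\mathbf{E} \triangleq \sigma_n^2 I$, $\mathbf{Y} \triangleq \Sigma_{Ay}$,
and $\mathbf{W} \triangleq \Sigma_{AA}^{-1}\Sigma_{Ay} = \mathbf{A}^{-1}\mathbf{Y}$. Then, $\mathbf{Y} = \mathbf{A}\mathbf{W}$.

$$\begin{aligned}
& \mathbf{W}^{\top}\mathbf{E}\mathbf{W} + \mathbf{W}^{\top}\mathbf{E}^{\top}\mathbf{B}^{-1}\mathbf{E}\mathbf{W} \ge 0 && (19) \\
\Rightarrow\; & \mathbf{W}^{\top}\mathbf{B}^{\top}\mathbf{B}^{-1}\mathbf{E}\mathbf{W} + \mathbf{W}^{\top}\mathbf{E}^{\top}\mathbf{B}^{-1}\mathbf{E}\mathbf{W} \ge 0 && (20) \\
\Rightarrow\; & \mathbf{W}^{\top}(\mathbf{B}+\mathbf{E})^{\top}\mathbf{B}^{-1}\mathbf{E}\mathbf{W} \ge 0 \\
\Rightarrow\; & \mathbf{W}^{\top}\mathbf{A}^{\top}\mathbf{B}^{-1}\mathbf{E}\mathbf{W} + \mathbf{W}^{\top}\mathbf{A}^{\top}\mathbf{W} \ge \mathbf{W}^{\top}\mathbf{A}^{\top}\mathbf{W} \\
\Rightarrow\; & \mathbf{W}^{\top}\mathbf{A}^{\top}(\mathbf{B}^{-1}\mathbf{E} + I)\mathbf{W} \ge \mathbf{W}^{\top}\mathbf{A}^{\top}\mathbf{W} \\
\Rightarrow\; & \mathbf{W}^{\top}\mathbf{A}^{\top}\mathbf{B}^{-1}(\mathbf{E} + \mathbf{B})\mathbf{W} \ge \mathbf{W}^{\top}\mathbf{A}^{\top}\mathbf{A}^{-1}\mathbf{A}\mathbf{W} \\
\Rightarrow\; & (\mathbf{A}\mathbf{W})^{\top}\mathbf{B}^{-1}\mathbf{A}\mathbf{W} \ge (\mathbf{A}\mathbf{W})^{\top}\mathbf{A}^{-1}\mathbf{A}\mathbf{W} \\
\Rightarrow\; & \mathbf{Y}^{\top}\mathbf{B}^{-1}\mathbf{Y} \ge \mathbf{Y}^{\top}\mathbf{A}^{-1}\mathbf{Y} \\
\Rightarrow\; & \Sigma_{yA}\Sigma_{BB}^{-1}\Sigma_{Ay} \ge \Sigma_{yA}\Sigma_{AA}^{-1}\Sigma_{Ay} . && (21)
\end{aligned}$$

To derive (19), since $\mathbf{W}$ is a vector and $\mathbf{E} = \sigma_n^2 I$, $\mathbf{W}^{\top}\mathbf{E}\mathbf{W} \ge 0$.
Since $\mathbf{B}$ is a covariance matrix that is invertible and
positive semi-definite, $\mathbf{B}^{-1}$ is positive semi-definite. Hence,
$\mathbf{W}^{\top}\mathbf{E}^{\top}\mathbf{B}^{-1}\mathbf{E}\mathbf{W} \ge 0$ and (19) therefore holds. Since $\mathbf{B}$ is
symmetric, $\mathbf{B}^{\top} = \mathbf{B}$. Hence, (20) can be obtained from (19).
The rest of the derivation from (20) to (21) is straightforward.
From (17),

$$\begin{aligned}
\sigma_{y|A}^2 &= \sigma_s^2 + \sigma_n^2 - \Sigma_{yA}\Sigma_{AA}^{-1}\Sigma_{Ay} \\
&\ge \sigma_s^2 + \sigma_n^2 - \Sigma_{yA}\Sigma_{BB}^{-1}\Sigma_{Ay} && (22) \\
&\ge \sigma_n^2 . && (23)
\end{aligned}$$

Note that (22) and (23) follow from (21) and (18), respectively.
Therefore, Lemma 6 holds. □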
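
Lemma 6 can be checked numerically on a small example. The sketch below is only an illustration under stated assumptions: it uses a one-dimensional squared exponential covariance with signal variance `signal_var` and adds `noise_var` on the diagonal, and it solves the linear system with a hand-written Gaussian elimination so that no external library is needed. All function names and numeric values are hypothetical.

```python
import math


def sq_exp_cov(a, b, signal_var, length_scale):
    """Noise-free squared exponential covariance between scalar locations."""
    return signal_var * math.exp(-((a - b) ** 2) / (2.0 * length_scale ** 2))


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for j in range(col, n + 1):
                aug[row][j] -= factor * aug[col][j]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][j] * x[j] for j in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def posterior_variance(y, sampled, signal_var, noise_var, length_scale):
    """Posterior variance at unobserved y given noisy samples at `sampled`."""
    cov_aa = [[sq_exp_cov(a, b, signal_var, length_scale)
               + (noise_var if i == j else 0.0)
               for j, b in enumerate(sampled)]
              for i, a in enumerate(sampled)]
    cov_ay = [sq_exp_cov(a, y, signal_var, length_scale) for a in sampled]
    weights = solve(cov_aa, cov_ay)
    prior = signal_var + noise_var        # prior variance includes the noise
    return prior - sum(w * c for w, c in zip(weights, cov_ay))


variance = posterior_variance(1.4, [0.0, 1.0, 2.0, 3.5], 1.0, 0.1, 1.0)
print(variance, variance >= 0.1)
```

However densely the sampled locations surround y, the printed variance stays at or above the noise variance, which is the floor that later lets the entropy differences be bounded.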



Lemma 7. H{Zx _ _-,\Z^ 



'H{Z^ 



< fc" log n 



77(1 -fry) 



Proof. We will first prove for the single-robot case. This
result will be used later for the multi-robot case. Let
$x_A \triangleq x_{i-m:i-1}$ and $x_p \triangleq x_{i-m-1}$.



2 



(Tg exp 



2 T 2 



„2c2 



(24) 

The first inequality follows from the property that variance
reduction is submodular [4] in many practical cases (e.g.,
further conditioning on $Z_{x_A}$ does not make $Z_{x_p}$ and $Z_{x_i}$
more correlated). To intuitively understand this notion of
submodularity, observing a new location $x_i$ will reduce the
variance at location $x_p$ more if few or no observations are
made, and less if many observations are already taken (e.g.,
at locations $x_A$). The second equality is due to (3). The
second inequality follows from the fact that the distance
between any two locations from stage $i$ and stage $i-m-1$
is at least $\omega_1(m+1)$. So, $\sigma_{x_p x_i} \le \sigma_s^2 \exp\{-(m+1)^2/(2\ell_1'^2)\}$.



= H{Zx^\Zj:^) — H{Zx^\Zj:^, Z^i) 



-1 1 ^^i 



I ^ A 



^ ^ iOg ^ 



log O + 



< 2 <! 1 



< log U + 



x^\(xA.Xi) 



(1 + 77) 



r?(l + 7?) 



(25) 
(26) 

(27) 
(28) 



Using g, ([25f can be obtained. Inequality ( |26[ ) results from 
(24 1. Inequality (27 1 can be obtained using Lemma [6] 

We will now prove for the k-robot case where k > 1. Then,
vectors $x_i$ and $x_p$ comprise k locations each. Let $x_i^{j}$ ($x_p^{j}$)
denote the j-th location component in vector $x_i$ ($x_p$). Let $x_i^{j:t}$
($x_p^{j:t}$) denote a vector comprising the j-th to t-th location
components in vector $x_i$ ($x_p$). Using the chain rule for
entropy [3],



H(Zxp\Zxa) — H{ZxJZxA, Zxi) 

= H[Z,i\Zx^,) - H{Z,i\Zx^,Zx,) + (29) 

k 

J2^^^xi\^xl^^-''^^A) - H{Z^jJZ^l-.j-l,Zx^,Zx^) . 
J=2 



(30) 



J2H{Z^,JZx,^)-H{Z^,JZx,^,Z^y.. 



i=2 



= E ^(^.j ) - mz^^pizx., , ) + 

J=2 

H{Z^2:k\Z_^j^,Zx^. , Z^l) - H{Z^2:k\Zx^. , ^^l)] 

= E ^(^.J ) - H{Z^^\Zx,^ , Z,. ) + 

J=2 

//(Z^2:fc|Za;^^. , ^^l) - H{Z^2:k\Z_^j^,Zx^. , Z^l) 
k 

-J2HiZ^,JZx^,^)~ H{Z^,JZx^,^,Z^^) + 

j=2 
fc 

^//(Z^t|Za:^^ , Z^l:t-l) - jy(Z^t|Z^j , Zj;^^ , Z^ 
t=2 



< k(k - l)log n + 



77(1 + 77) 



(32) 



The inequality follows from a derivation similar to (28).
Combining (31) and (32), Lemma 7 results. □

Corollary 8. For $t = 1, \ldots, i-m-1$,

$$H(Z_{x_t} \mid Z_{x_{t+1:i-1}}) - H(Z_{x_t} \mid Z_{x_{t+1:i-1}}, Z_{x_i}) \le k^2 \log\left\{1 + \frac{\xi^2}{\eta(1+\eta)}\right\} .$$

Proof. The proof is similar to that of Lemma 7. □
Lemma 9. 

e 



H{Zx,\Zx,_^._,_,)~H{Zx,\Zx,,,_,) < (7-7r7-l)fc"log|l 
Proof. Using the chain rule for entropy [3], 

Z x f] Z x ^ _ _ i) 

— H{Zxi\Zxi_,„.i_i) + H{Zx:^-i_„^_:^\Zxi_^.i_i, Zxi 



7,(1 + 77) 



(33) 



For (|29|, 

H{Z,i\Zx^)-HiZ^i\Zx^,Z^i:k) 
= H(Z^.\Zx^)~[H{Z^.\Zx^,Z^i) + 

H{Z^2:k\Z^l, ZxA, Z^l) ~ H{Z^2:k\ZxA,Z^l)] 

= H{Z^i\Zxa) - H{Z^i\ZxA,Z,^i) + 

H{Z^2:k\ZxA,Z^l) - f/'(Z^2:fc|.^3;l,Z^_4,^^l) 
= ff(Z,llZ.J-i/(Z.l|Z.^,Z,l) + 
k 

'^H{Z^t\Zxji,Z^i:t-i) - H{Z^t\Z^i,ZxA,Z^i 



< log n + 



7,(1+77) 



(31) 



The inequality follows from a derivation similar to (28 1. 
Let XAj denote a vector concatenating Xp'-'^^ and xa- For 



^{Zxi-i^jjT^^l : Zxi\Zxi_„^.f_i) 
= H(Zxi.i_„^_i \Zxi_„i:i-l) + -'^(^a;i|^a:i.i_„ 
= -ff (^ii:i_„_i ) + "'^ (^a: i I -^a: 1 : i - 1 ; 

From (|33l) and dMl, 



-U Zxi 



(34) 



= H{Zxi.i_^_i \ Zxi_^.^_i ) ~ H(^Zxi-i_^-,^_i \Zx^_^^.i_i Zxi) ■ 

(35) 

Applying the chain rule for entropy to ( |35[ ), 

5 ) 

i — m — 1 

= E ^(Zxt\Zxt+i:,-i) - H{Zxt\Zxt+i,i_i, Zx,) (36) 



< (i - TTl - l)fc^ log <! 1 + ^ 



77(1 + 77) 



(37) 



The inequality (37) follows from Corollary 8. So, Lemma 9
holds. □



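A.3 Proof of Theorem 2

Let

$$\theta \triangleq \Big[H(Z_{x^{E}_{1:m}}) + \sum_{i=m+1}^{n} H(Z_{x^{E}_{i}} \mid Z_{x^{E}_{i-m:i-1}})\Big] - \Big\{H(Z_{x^{*}_{1:m}}) + \sum_{i=m+1}^{n} H(Z_{x^{*}_{i}} \mid Z_{x^{*}_{i-m:i-1}})\Big\} . \tag{38}$$

From Lemma 5, $\theta \ge 0$. By the chain rule for entropy [3],

$$H(Z_{x^{*}_{1:n}}) - H(Z_{x^{E}_{1:n}}) = \Big[H(Z_{x^{*}_{1:m}}) + \sum_{i=m+1}^{n} H(Z_{x^{*}_{i}} \mid Z_{x^{*}_{1:i-1}})\Big] - \Big\{H(Z_{x^{E}_{1:m}}) + \sum_{i=m+1}^{n} H(Z_{x^{E}_{i}} \mid Z_{x^{E}_{1:i-1}})\Big\} . \tag{39}$$

Let $\Delta^{*}_{i} \triangleq H(Z_{x^{*}_{i}} \mid Z_{x^{*}_{i-m:i-1}}) - H(Z_{x^{*}_{i}} \mid Z_{x^{*}_{1:i-1}})$ and
$\Delta^{E}_{i} \triangleq H(Z_{x^{E}_{i}} \mid Z_{x^{E}_{i-m:i-1}}) - H(Z_{x^{E}_{i}} \mid Z_{x^{E}_{1:i-1}})$ for $i = m+1, \ldots, n$.
Then, (39) can be re-written as

$$\begin{aligned}
& H(Z_{x^{*}_{1:m}}) + \sum_{i=m+1}^{n} \big[H(Z_{x^{*}_{i}} \mid Z_{x^{*}_{i-m:i-1}}) - \Delta^{*}_{i}\big] - \Big\{H(Z_{x^{E}_{1:m}}) + \sum_{i=m+1}^{n} \big[H(Z_{x^{E}_{i}} \mid Z_{x^{E}_{i-m:i-1}}) - \Delta^{E}_{i}\big]\Big\} \\
&= \sum_{i=m+1}^{n} \big[\Delta^{E}_{i} - \Delta^{*}_{i}\big] - \theta && (40) \\
&\le \sum_{i=m+1}^{n} \big[\Delta^{E}_{i} - \Delta^{*}_{i}\big] && (41) \\
&\le \sum_{i=m+1}^{n} \Delta^{E}_{i} . && (42)
\end{aligned}$$

Since $\theta \ge 0$, (41) results. Since $\Delta^{*}_{i} \ge 0$ for $i = m+1, \ldots, n$,
(42) follows. By Lemma 9,

$$\Delta^{E}_{i} \le (i-m-1)k^2 \log\left\{1 + \frac{\xi^2}{\eta(1+\eta)}\right\}$$

for $i = m+1, \ldots, n$. Then, Theorem 2 follows.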
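
The last step of the proof sums the per-stage bounds from Lemma 9 over $i = m+1, \ldots, n$. The Python sketch below carries out that summation numerically and compares it with the closed form $\tfrac{1}{2}(n-m)(n-m-1)k^2\log\{\cdot\}$ of the arithmetic series. It is a sketch under assumptions: the per-stage bound is taken in the form $(i-m-1)k^2\log\{1+\xi^2/(\eta(1+\eta))\}$ as read from the damaged source, $\eta$ is treated as a noise-to-signal variance ratio, and the function names and parameter values are hypothetical.

```python
import math


def stage_bound(i, m, k, xi, eta):
    """Assumed per-stage bound: (i - m - 1) k^2 log{1 + xi^2 / (eta (1 + eta))}."""
    return (i - m - 1) * k ** 2 * math.log(1.0 + xi ** 2 / (eta * (1.0 + eta)))


def total_bound(n, m, k, normalized_length_scale, eta):
    """Sum of the per-stage bounds over stages i = m + 1, ..., n."""
    xi = math.exp(-((m + 1) ** 2) / (2.0 * normalized_length_scale ** 2))
    return sum(stage_bound(i, m, k, xi, eta) for i in range(m + 1, n + 1))


def closed_form(n, m, k, normalized_length_scale, eta):
    """Same sum written via the arithmetic series 0 + 1 + ... + (n - m - 1)."""
    xi = math.exp(-((m + 1) ** 2) / (2.0 * normalized_length_scale ** 2))
    log_term = math.log(1.0 + xi ** 2 / (eta * (1.0 + eta)))
    return 0.5 * (n - m) * (n - m - 1) * k ** 2 * log_term


# Hypothetical setting: n = 45 stages, k = 1 robot, length-scale 2, eta = 0.1.
for m in (1, 2, 3, 4):
    print(m, total_bound(45, m, 1, 2.0, 0.1))
```

Under these assumptions the summed bound shrinks as m grows, both because $\xi$ decays and because fewer stages contribute, which agrees with the stated trade-off between planning time and active sensing performance.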

B. MUTUAL INFORMATION-BASED PATH 
PLANNING 

B.1 Proof of Theorem 3

Given each vector $x_{i-2m:i-1}$, the time needed to evaluate
$I(Z_{x_{i-m}}; Z_{u_{i-2m:i}} \mid Z_{x_{i-2m:i-m-1}})$ over all possible $x_i \in \mathcal{X}_i$ is
$\chi \times O([r(2m+1)]^3) = O(\chi[r(2m+1)]^3)$. The time needed to
perform this over all $\chi^{2m}$ possible vectors $x_{i-2m:i-1}$ in each
stage $i$ is $\chi^{2m} \times O(\chi[r(2m+1)]^3) = O(\chi^{2m+1}[r(2m+1)]^3)$.
Similar to the MEPP(m) algorithm, the conditional mutual
information terms calculated in a stage are the same
as those in every other stage. The time needed to propagate
the optimal values from stages $n-1$ to $2m+1$ is
$O(\chi^{2m+1}(n-2m-1))$. Similarly, the time needed to evaluate
$I(Z_{x_{n-m:n}}; Z_{u_{n-2m:n}} \mid Z_{x_{n-2m:n-m-1}})$ over all possible
$x_n \in \mathcal{X}_n$ and all $\chi^{2m}$ possible vectors $x_{n-2m:n-1}$ in stage
$n$ is $O(\chi^{2m+1}[r(2m+1)]^3)$. To obtain the optimal vector
$x^{M}_{1:2m}$, $I(Z_{x_{1:m}}; Z_{u_{1:2m}})$ has to be evaluated over all
possible $x_{1:2m}$. Hence, the time needed to solve for the optimal
vector $x^{M}_{1:2m}$ is $O(\chi^{2m}[r(2m)]^3)$. As a result, the time
complexity of the M²IPP(m) algorithm is
$O(\chi^{2m+1}(n-2m-1+[r(2m+1)]^3) + \chi^{2m+1}[r(2m+1)]^3 + \chi^{2m}[r(2m)]^3)
= O(\chi^{2m+1}(n + 2[r(2m+1)]^3))$.
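
To see how quickly this bound grows, the following Python sketch evaluates the dominant term $\chi^{2m+1}(n + 2[r(2m+1)]^3)$ without constant factors. As an assumption made for illustration, $\chi$ is taken to be the binomial coefficient counting placements of k robots in a column of r cells; the function name `m2ipp_cost` and the grid sizes are hypothetical.

```python
from math import comb


def m2ipp_cost(n, r, k, m):
    """Dominant term chi^(2m+1) * (n + 2 * (r * (2m + 1))^3) of the M2IPP(m) bound.

    Assumes chi = C(r, k): the number of ways to place k robots
    in one column of r cells.
    """
    chi = comb(r, k)
    return chi ** (2 * m + 1) * (n + 2 * (r * (2 * m + 1)) ** 3)


# Hypothetical setting: n = 45 columns, r = 8 rows, k = 1 robot.
for m in (1, 2):
    print(m, m2ipp_cost(45, 8, 1, m))
```

Each increment of m multiplies the exponential factor by $\chi^2$ because the history grows by two stages, which is consistent with the observation that raising m is computationally cheaper for MEPP(m) than for M²IPP(m).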

B.2 Proof of Some Lemmas 

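Before giving the proof of Theorem 4, the following lemmas
are needed.

Lemma 10. For any observation paths $x_{1:n}$,

$$\begin{aligned}
& I(Z_{x^{M}_{1:m}}; Z_{u^{M}_{1:2m}}) + \sum_{i=2m+1}^{n-1} I(Z_{x^{M}_{i-m}}; Z_{u^{M}_{i-2m:i}} \mid Z_{x^{M}_{i-2m:i-m-1}}) + I(Z_{x^{M}_{n-m:n}}; Z_{u^{M}_{n-2m:n}} \mid Z_{x^{M}_{n-2m:n-m-1}}) \\
&\ge I(Z_{x_{1:m}}; Z_{u_{1:2m}}) + \sum_{i=2m+1}^{n-1} I(Z_{x_{i-m}}; Z_{u_{i-2m:i}} \mid Z_{x_{i-2m:i-m-1}}) + I(Z_{x_{n-m:n}}; Z_{u_{n-2m:n}} \mid Z_{x_{n-2m:n-m-1}}) .
\end{aligned}$$

Proof. Using (12),

$$\begin{aligned}
U_{2m+1}(x_{1:2m})
&= \max_{x_{2m+1} \in \mathcal{X}_{2m+1}} I(Z_{x_{m+1}}; Z_{u_{1:2m+1}} \mid Z_{x_{1:m}}) + U_{2m+2}(x_{2:2m+1}) \\
&= \max_{x_{2m+1} \in \mathcal{X}_{2m+1}} I(Z_{x_{m+1}}; Z_{u_{1:2m+1}} \mid Z_{x_{1:m}}) + \max_{x_{2m+2} \in \mathcal{X}_{2m+2}} I(Z_{x_{m+2}}; Z_{u_{2:2m+2}} \mid Z_{x_{2:m+1}}) + U_{2m+3}(x_{3:2m+2}) \\
&= \max_{x_{2m+1} \in \mathcal{X}_{2m+1},\, x_{2m+2} \in \mathcal{X}_{2m+2}} I(Z_{x_{m+1}}; Z_{u_{1:2m+1}} \mid Z_{x_{1:m}}) + I(Z_{x_{m+2}}; Z_{u_{2:2m+2}} \mid Z_{x_{2:m+1}}) + U_{2m+3}(x_{3:2m+2}) \\
&= \ldots \\
&= \max_{x_{2m+1}, \ldots, x_{n}} \sum_{i=2m+1}^{n-1} I(Z_{x_{i-m}}; Z_{u_{i-2m:i}} \mid Z_{x_{i-2m:i-m-1}}) + I(Z_{x_{n-m:n}}; Z_{u_{n-2m:n}} \mid Z_{x_{n-2m:n-m-1}}) .
\end{aligned} \tag{43}$$

Given $x_{1:2m}$, the vectors $x_{2m+1}, \ldots, x_n$ that maximize the
term $\sum_{i=2m+1}^{n-1} I(Z_{x_{i-m}}; Z_{u_{i-2m:i}} \mid Z_{x_{i-2m:i-m-1}}) +
I(Z_{x_{n-m:n}}; Z_{u_{n-2m:n}} \mid Z_{x_{n-2m:n-m-1}})$ in (43) can be obtained.
Using (13), the paths $x_{1:n}$ that maximize $I(Z_{x_{1:m}}; Z_{u_{1:2m}}) +
\sum_{i=2m+1}^{n-1} I(Z_{x_{i-m}}; Z_{u_{i-2m:i}} \mid Z_{x_{i-2m:i-m-1}}) +
I(Z_{x_{n-m:n}}; Z_{u_{n-2m:n}} \mid Z_{x_{n-2m:n-m-1}})$ can be obtained.
Therefore, Lemma 10 holds. □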

Lemma 11. _fbr t = 1, . . . , i — 2m. — 1, 

H{Zxt\Zxt^-l^.i_^_-^ , •^Ui_2777:i) ~ ^iZxtlZxt^i,, 



-1 1 ^"!-277i:i 1 



^-.-,77) 



< log <^ 1 



Proof. The proof is similar to that of Corollary 8. □
Corollary 12. For t = I, . . . ,i ~ 2m — 1, 

H{Zut\Zxi-i_^_i , -^ut+i.i) ^ H{Zut\Zxi.i_,„_i, Zut^i-i , Zxi_^, 

e 



< k{r - fc)log <^ 1 + 



7,(1-^7?) 



Proof. Note that the size of vector $u_t$ is $r - k$. The proof
is similar to that of Corollary 8. □



Corollary 13. For t ^ i + 1 
< fc(r - fc)log |l + ^ 



,n, 



_1 ! ■^1ii:f „i 1 Zx^_^ 



77(1 + r?) 

Proof. The proof is similar to that of Corollary 12. □



Lemma 14. 
<{n~2m- l)rk log <j 1 + 



-1 : ^Ui^2n 



-1 > ^^Ir. 



77(1+7?) 



Proof. Let X/\ (""a) denote a vector of all the locations 

of a;i:i_m-l (Ul:n) eXCludlng thoSe of a;i_ 2m:i-m- 1 {Ui-2m:i)- 

That is, a;A = xi;i-2m-i and «a — (7ti:i-2m-i, 



H{Zxi_^\Zxi_2,rt;i-rn- 

— H{Zxi_^\Zxi_2r„A 

[H{Zxi_^\Zxi_2^;i 

H{Zx^ , Zu^\Zxi_2 
H{Zx^ , Zu^\Zxi__2 

— H{Zx^ , Zu^\Zxi_2 



1 1 -^U4_2t7 



-1 ' -^^^1-271 



-1 1 ■^Ui-2m:i)] 



_1 1 



H{Zx^ , -^UA !^a;i_2m:i-m-l 1 ■^'"i-2m:i I 



2m:i-jn-l 1 ^'"i-2,j 



[H{Z^a\Z 



+ 



i — 2m— 1 

— [H{Zxt\Zxt + l:i-m-n^U 



H{Zxt\Zx 

-2m-l 



-1 ' ^^i-2m:i 5 "^^i 



-1 1 + 



+ 



-1 > ^^1: : 



(44) 
(45) 

(46) 



where 

Ai — m ~ -^(■^^i-Tii I ■^3^i-2m:i-m-l ' ■^■l^i-2m:i ) 

Bi-m = H{Zxi_^\Zxi_2^,i_,ri-l) ~ -'^(^^i-m ) 

and 

e 



At-m < (77 - 2777 - l)rfclog <^ 1 + 



77(1 + 77) 



B«-m < (i - 2777 - l)fc^ log n 



77(1+77) 



Proof. By the definition of conditional mutual informa- 
tion, 



— H{Zxi_^\Zxi.^_^_-^) — H{Zxi_,„\Zxj^.i_,„_-^ y^-ui.n) I 



I{Zxi_^ ; Zui_2m:i\Zxi-2m:i-m-l) 
— H [Zxi_„^\Zxi_2,T^A-m-l) ~~ H{Zxi_^\Zx 



(50) 
(51) 



Using (|50|) and ||51|, 

^{Zxi^^ \ Zu-Y.j^\Zx-i.i~Tn-l) ^^Z^i-m'-} Z^i-2iTi:i\Z^i-2m:i-m-l) 
— Ai — ra Bi — m • 

By Lemma [14] 



^,_m < (77 - 2777 - l)rfclog <^ 1 + 



77(1 + 77) 



Using Lemma [9j 



B,-m < (i - 2777 - l)fc^ log <{ 1 + ^ 



77(1 + 77) 



Therefore, Lemma 15 holds. □

B.3 Proof of Theorem 4

Let 



H{Zu,\Z 



-1 I Z^t+l-.i 1 -^a: 



+ 



t=i + l 

< [(i - 2777 - l)fc^ + (77, - 2771 - l)(r - k)k] log <{ 1 + 



< (77- 277l-l)[fc' + (r-fc)fc]log|l+— .1 

I '?(! + ??) J 



(77 — 2771 — \)rk log -11 + 



(47) 

77(1 + 77) 
(48) 



(49) 



77(1 + 77) 

Using the chain rule for entropy (44 1, (461, and (47 1 can 
be obtained. Using Lemma [Tl] and Corollaries [12] and |13[ 
can be obtained. □ 

Lemma 15. 

— -Ai — jfi Bi — m 



i=2m+l 



I{Z^m \Z^m \Z^ 



-2m:n~m-l 



i^(^^i.^'^<2J+ E ^(^-?-.J^":-2™J^-f-2™.-,„-i) + 



HZ 



X _ .7 



\Zx 



i-2m:n-m-l ^ 



(52) 



From Lemma 10, $\theta \ge 0$. By the chain rule for mutual
information [3],

^iZxl.^,Z^*J) - I(Z^m^,Z^mJ 

n-1 

= I{Zxi^;Z^<^J+ E ^(^<-„;^-L„l^-^.-™-i) + 

i=2m + l 
n-1 

[/(^^M^; Z„M^) + E ^^^xf-m.' ^^f-j^'^f;-rr,-j ~^ 



i=2m + l 



7(Z^H ^ ;Z„M \Z^ 



(53) 



By the definition of mutual information, 

I{Z^,..^-Zu,.J ^ H{Z^,.^) - H[Z^,. JZ^,.J , (54) 
HZa:,:^;Z^,.,^) ^ H{Z^,. J ~ H{Z^,. JZ^,.,^) . (55) 

Using the chain rule for entropy [s], 



H{Zxi.^\Zui.2^ ) 
= H{Zxi\Zui.2^] 
H{Zxm\Zx 



H{Zxi.,„\Zui.„) 
~ H{Zx,\Zu,:J + ...+ 

' H[Zxjri |-^a;i:m-l 1 Zuiy 



— HiZxt\Zxi-t_i, Zui.2m) ~ H{Zxt\Zxi-t_i, Zui.„) 



< m{n — 2m) (r — fc)fclog s 1 + 



r?(l + 77) 



(56) 



Inequality (56) can be obtained using a proof similar to that of
Lemma 14. Applying (56) to (54) and (55),



where 



^{Zxi.jn i -^^^1:71 ) — ^{Zxi-jn 5 Zui.2m ) — ^1- 



vii + v) 



(57) 



(58) 



Ai:m < m{n — 2m)(r — k)k\og | ^ + 
By the definition of mutual information, 

(59) 

= -ff(^a:„_„:„!^a:„_2m:„-m-l) — 



(60) 



Using the chain rule of entropy, 

H{Zx„_^-„\Zx^_2„^:„-rn-l) ~ H{Zx,^_^-„ \Zxi,, 
n 

~ ^ H{Zxt\Zx^_2m:t-l) ~ H{Zxf\Zxi,t-l, 



< (m+ 2m- l)fc^log|l + 



r;(l + 77) 



(61) 
(62) 



Using Lemma 9, inequality (62) can be obtained.
By the chain rule of entropy,

^{Zx^^^-n IZxi-n^jn^i , Zui-^) 



where 



An-m:„ < (m + l)(n — 2m — l)rfclog < 1 + 



B 

n — m-.n 

< (m + l)(n - 2m - l)k^ log n + 



'?(1 + '7)J ' 

(65) 

e ' 
77(1+77) 



(66) 



Using the above results, (53) can be rewritten as



-2m:t-l 1 ^'^n-2n 



t — n — m 

< (m + l)(n - 2m - l)rfclog <{ 1 + 



J — H{Zxt\Zxi.t-i, Zui-n) 



77(1+77) 



(63) 



Using a proof similar to that of Lemma 14, inequality (63) can be
obtained. Applying the results (62) and (63) to (59) and
(60),



HZx„_^-„ y Zui.„\Zxi.,^_^_l^ 
^iZxn_^-^j Zu^_2m:n\Zx^_2r, 
— -^n — m-.n Bn — m:n 



HZxi.^; Z^i^) - HZxfJ Z^M^) 
= Alr„, + I{Zxi^;Zui,^J + 

[^i-m — -Bi-m + HZx*_^; ■^"*_2m il'^^i-2m i-™-l)]~'" 

i = 2m + l 
n-1 

- + /(^^M_ ■,Z^M_^ ,.\Z^M_ _ _ )] + 



A-n-m-.n — Bn-m:n + I{Z,JA \ Z \Z ^ 



,)} 



(67) 



^l:m ^ ^ (^i — m -^z — m) (^n — m;n ^ n — ra'.n) 
i = 2m+l 
n-1 

L l:m ~r / ^ — m ^i — m) ' V n — Tn;n ^n — m:n)\ ^ 
i=2m+l 

(68) 

n — 1 71 — 1 

< 4* _|_ A* -1_ 4* -L \^ R^^ -L 

_ -^l:m I / ^ -'^i — m ~r -^n — m:n 1 / ^ ^i — m 1 ^n — m:n 

i — 2m+l i — 2m+l 

n — 1 71 — 1 

[4^ _|_ \ ^ I yjM I \ ^ R* I R* 1 

i— 2m+l i— 2r7i+l 

(69) 

n— 1 n— 1 

< 4* -I- 4* _i_ 4* 4_ d" I r" 



i=2m+l 



i=2m+l 



(70)

$$\begin{aligned}
&\le \Big[m(n-2m)(r-k)k + (n-2m-1+m+1)(n-2m-1)rk \\
&\qquad + \tfrac{1}{2}(n-2m-1)(n-2m-2)k^2 + (m+1)(n-2m-1)k^2\Big] \log\left\{1 + \frac{\xi^2}{\eta(1+\eta)}\right\} && (71) \\
&\le \Big[m(n-2m)(r-k)k + (n-m)(n-2m)rk \\
&\qquad + \tfrac{1}{2}(n-2m)(n-2m-2)k^2 + (m+1)(n-2m)k^2\Big] \log\left\{1 + \frac{\xi^2}{\eta(1+\eta)}\right\} \\
&= \Big[m(n-2m)(r-k)k + (n-m)(n-2m)rk \\
&\qquad + \tfrac{1}{2}(n-2m)(n-2m)k^2 + m(n-2m)k^2\Big] \log\left\{1 + \frac{\xi^2}{\eta(1+\eta)}\right\} \\
&= \Big[mr + (n-m)r + \tfrac{1}{2}(n-2m)k\Big](n-2m)k \log\left\{1 + \frac{\xi^2}{\eta(1+\eta)}\right\} \\
&= \Big[nr + \tfrac{1}{2}(n-2m)k\Big](n-2m)k \log\left\{1 + \frac{\xi^2}{\eta(1+\eta)}\right\} .
\end{aligned}$$



(64) 



Using (57), (64), and Lemma 15, (67) can be obtained. Applying
$\theta$ (52) to (67), (68) can be obtained. Since $\theta \ge 0$, (69) can
be obtained. Using (58), (65), (66), and Lemma 15, (70)
and (71) can be obtained. Therefore, Theorem 4 holds.
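
The algebraic simplification at the end of this proof, from the four-term bracket to $[nr + \tfrac{1}{2}(n-2m)k](n-2m)k$, can be double-checked with exact rational arithmetic. The Python sketch below compares the two coefficients of the logarithm over a small grid of integer parameters. The function names are hypothetical and the parameter ranges are arbitrary.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def expanded(n, m, r, k):
    """Four-term coefficient of the log factor before simplification."""
    span = n - 2 * m
    return (m * span * (r - k) * k
            + (n - m) * span * r * k
            + HALF * span * (span - 2) * k ** 2
            + (m + 1) * span * k ** 2)


def simplified(n, m, r, k):
    """Final coefficient [n r + (1/2)(n - 2m) k] (n - 2m) k."""
    span = n - 2 * m
    return (n * r + HALF * span * k) * span * k


mismatches = [
    (n, m, r, k)
    for n in range(5, 20)
    for m in range(1, 3)
    for r in range(2, 7)
    for k in range(1, r)
    if expanded(n, m, r, k) != simplified(n, m, r, k)
]
print(len(mismatches))
```

Using `Fraction` avoids any floating-point rounding, so an empty list of mismatches means the two expressions agree exactly on every parameter combination tried.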



